AMERICAN 
JOURNAL OF MATHEMATICS 


FOUNDED BY THE JOHNS HOPKINS UNIVERSITY 


EDITED BY 


ABRAHAM COHEN, THE JOHNS HOPKINS UNIVERSITY
F. D. MURNAGHAN, THE JOHNS HOPKINS UNIVERSITY
J. F. RITT, COLUMBIA UNIVERSITY
T. H. HILDEBRANDT, UNIVERSITY OF MICHIGAN
R. L. WILDER, UNIVERSITY OF MICHIGAN


WITH THE COOPERATION OF 


OYSTEIN ORE E. T. BELL C. R. ADAMS 

H. P. ROBERTSON H. B. CURRY R. D. JAMES 

M. H. STONE E. J. MCSHANE SAUNDERS MACLANE 
T. Y. THOMAS HANS RADEMACHER GABOR SZEGO 

G. T. WHYBURN OSCAR ZARISKI LEO ZIPPIN 


PUBLISHED UNDER THE JOINT AUSPICES OF 
THE JOHNS HOPKINS UNIVERSITY 
AND 
THE AMERICAN MATHEMATICAL SOCIETY 


VOLUME LXII
1940


THE JOHNS HOPKINS PRESS
BALTIMORE, MARYLAND
U. S. A.




SYMBOLIC DYNAMICS II. STURMIAN TRAJECTORIES.* 


By MARSTON MORSE and GUSTAV A. HEDLUND.


1. Introduction. In a recent paper¹ we initiated a theory of symbolic
dynamics. In this theory we consider unending sequences of symbols or 
symbolic trajectories and devote attention to those properties of symbolic 
trajectories which are suggested by dynamical considerations. A symbolic 
trajectory is formed from symbols taken from a finite set of generating symbols 
subject to certain rules of admissibility. In SD admissibility conditions were 
formulated of such generality that the resulting symbolic trajectories include 
in particular those which arise in the geodesic problem on surfaces which 
satisfy the condition of uniform geodesic instability. 

However, no surface of the topological type of a torus satisfies the con- 
dition of uniform geodesic instability and the admissibility conditions of SD 
do not include those which arise in the case of the torus. 

In the present paper we consider a class of symbolic trajectories formed
from two generating symbols subject to admissibility conditions defined by a
simple comparison property. These are the symbolic trajectories which
characterize the geodesics on a flat torus. They may be used to characterize the
distribution of the zeros of the solutions of a differential equation of the form
$y'' + f(x)y = 0$, where $f(x)$ is a periodic function of $x$. We term the
trajectories of this class Sturmian. A first fundamental result is as follows:


Sturmian trajectories possess certain numerical characteristics, namely,
a frequency, a pole, and a type index, and admit mechanical constructions
uniquely determined by these characteristics.


There are three types of Sturmian trajectories: irrational, skew and
periodic. The trajectories of irrational type are recurrent but not periodic;
those of skew type are not recurrent. The recurrency function of a recurrent
Sturmian trajectory is completely determined by the frequency $\alpha$ of the
trajectory and may be denoted by $R(n,\alpha)$. We introduce the variable
$\gamma = \alpha(1+\alpha)^{-1}$. Let $C_\nu/D_\nu$ be the convergents in a
continued fraction representation of $\gamma$. We have the following
fundamental theorem:


* Received June 19, 1939.
¹ Cf. Morse and Hedlund. (References will be found in the bibliography at the end
of the paper.) This paper will hereafter be referred to as SD. Numerous references
will be found in the bibliography at the end of SD.


When $\alpha$ is irrational, $R(n,\alpha)$ increases by unity when $n$ increases from
$n-1$ to $n$ except when $n = D_\nu$, $\nu = 0, 1, \cdots$. For these exceptional values
of $n$ we have the relation

$$R(D_\nu, \alpha) = D_{\nu+1} + 2D_\nu - 1 \qquad (\nu = 0, 1, 2, \cdots),$$

starting with $\nu = 1$ in the special case $D_0 = D_1$.


The preceding theorem thus gives a simple mode of evaluating $R(n,\alpha)$
when $\alpha$ is irrational. Previous to this the only non-periodic recurrent
trajectory of which the recurrency function had been determined was the Morse
recurrent trajectory (cf. SD § 8).

The evaluation of $R(n,\alpha)$ permits various extensions of our knowledge of
recurrency functions. In particular we are able to solve one of the problems posed
at the end of SD. We had shown in SD § 7 that if $R(n)$ is the recurrency
function of a general non-periodic recurrent trajectory, $\liminf R(n)/n \ge 2$. The
results of the present paper show that the constant 2 cannot be replaced by a
greater constant.

The proper choice of $\alpha$ yields a recurrency function $R(n,\alpha)$ such that

$$R(n,\alpha) < \frac{5+\sqrt{5}}{2}\, n,$$

with $R(n,\alpha)$ becoming infinite more slowly than for any other previously
known non-periodic trajectory. A final result on the asymptotic behavior of
$R(n,\alpha)$ is as follows. Let $\varphi(x)$ be a positive monotonically increasing
function of $x$ defined for $x > 0$. As $n$ becomes infinite the lim. sup. of

$$\frac{R(n,\alpha)}{n\,\varphi(\log n)}$$

is finite or infinite for almost all values of $\alpha$ according as the series

$$\sum_{n=1}^{\infty} \frac{1}{\varphi(n)}$$

converges or diverges. In particular the lim. sup. of

$$\frac{R(n,\alpha)}{n \log n}$$

is infinite for almost all values of $\alpha$ while the lim. sup. of

$$\frac{R(n,\alpha)}{n (\log n)^{c}} \qquad (c > 1)$$

is zero for almost all values of $\alpha$.




The results of this paper and of its predecessor will presently be given their
appropriate dynamical setting (cf. G. D. Birkhoff) in terms of trajectories on
a space form.

I. CLASSIFICATION AND REPRESENTATION.

2. The comparison condition and general theorems. We shall consider
sequences $X$ of two symbols $a$ and $b$ of the forms

(2.1) $\cdots\, a B_{-1}\, a B_0\, a B_1\, a\, \cdots$,

(2.2)' $a B_r\, a B_{r+1}\, a\, \cdots$,

(2.2)'' $\cdots\, a B_{s-1}\, a B_s\, a$,

(2.3) $a B_r\, a \cdots a B_s\, a \qquad (r \le s)$,

in which $B_n$ is a finite block of $b$'s. We admit that $B_n$ may be the null set.
We term $B_n$ the cell of $X$ of index $n$, and term $X$ a cell-sequence. A
cell-sequence $X$ of the form (2.1), (2.2) or (2.3) will be respectively termed a
cell-series, a cell-beam or a chain. A chain which contains $n$ cells will be
termed an $n$-chain. If the chain (2.3) appears in $X$ it will be termed the
chain $[r,s]$ of $X$.

Two cell-series (cell-beams) $X$ and $Y$ will be regarded as identical if and
only if the cells of $X$ and $Y$ have the same index range and if cells with the
same index are identical. On the other hand, two $n$-chains of the form

$$a B_{r+1}\, a \cdots a B_{r+n}\, a, \qquad a B_{s+1}\, a \cdots a B_{s+n}\, a,$$

will be regarded as identical if and only if

$$B_{r+i} = B_{s+i} \qquad (i = 1, \cdots, n).$$

The number of symbols $b$ in an $n$-chain $x$ will be called the $b$-length of $x$.
We shall be concerned with cell-sequences $X$ which satisfy the following
condition.

CONDITION C. The $b$-lengths of any two $n$-chains of $X$ with the
same $n$ shall differ by at most one.

We term this condition the comparison condition. Cell-sequences which 
satisfy the comparison condition will be called Sturmian. As we shall see, 
Sturmian sequences appear in the theory of linear second order differential 
equations. 
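The comparison condition can be tested mechanically on any finite chain. The following Python sketch is an illustration only: it assumes that a chain is encoded as the list of the $b$-lengths of its consecutive cells (an encoding not used in the paper), so that an $n$-chain is a run of $n$ consecutive entries. The function names are hypothetical.

```python
from typing import Sequence


def chain_b_lengths(cells: Sequence[int], n: int) -> set[int]:
    """Return the set of b-lengths of all n-chains of a finite chain.

    `cells` lists the b-length of each cell (an assumed encoding); an
    n-chain is a run of n consecutive cells.
    """
    return {sum(cells[i:i + n]) for i in range(len(cells) - n + 1)}


def satisfies_condition_c(cells: Sequence[int]) -> bool:
    """True if, for every n, the b-lengths of the n-chains differ by at most one."""
    for n in range(1, len(cells) + 1):
        lengths = chain_b_lengths(cells, n)
        if max(lengths) - min(lengths) > 1:
            return False
    return True


if __name__ == "__main__":
    # Two periods of the block 2, 1, 2, 1, 1 satisfy the comparison condition.
    print(satisfies_condition_c([2, 1, 2, 1, 1, 2, 1, 2, 1, 1]))
    # The 2-chains of this chain have b-lengths 2, 3 and 4, so it fails.
    print(satisfies_condition_c([1, 1, 2, 2]))
```

The check is quadratic in the number of cells for each $n$; prefix sums would speed it up, but the direct form mirrors the statement of the condition.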

THEOREM 2.1. The $b$-lengths $b_m$ and $b_n$ of arbitrary $m$- and $n$-chains
of a Sturmian chain satisfy the relation

(2.4) $n(b_m + 1) > m(b_n - 1)$.




We shall give an inductive proof of the theorem, first noting that (2.4)
holds when $m = n = 1$. Let $N$ be any positive integer at most the maximum
of $m$ and $n$ in the theorem. We assume the truth of (2.4) for $n$ and $m$ both
less than $N$, and shall prove that (2.4) holds when $m$ and $n$ are at most $N$.
Since (2.4) holds for $m = n$, there remain two cases to consider.


Case I. $N = m > n$. We here set $m = pn + q$ with $0 \le q < n$. An
$m$-chain $x$ may be regarded as a sequence of $p$ successive $n$-chains followed by
a $q$-chain, two successive chains having a symbol $a$ in common. The $b$-lengths
of our $p$ $n$-chains are each at least $b_n - 1$ so that

(2.5) $b_m \ge p(b_n - 1) + b_q$,

where $b_q$ is the $b$-length of the final $q$-chain of $x$. We are assuming that
(2.4) holds when $m$ and $n$ are both less than $N$, so that

(2.6) $n(b_q + 1) > q(b_n - 1)$.

Adding 1 to both members of (2.5) and multiplying by $n$ we find that

(2.7) $n(b_m + 1) \ge np(b_n - 1) + n(b_q + 1)$.

Upon using (2.6), (2.7) takes the form (2.4) and the proof is complete
in Case I.
Case II. $N = n > m$. Set $n = pm + q$ with $0 \le q < m$. Essentially
as in Case I we find that

(2.8) $b_n \le p(b_m + 1) + b_q$,

where $b_n$ is the $b$-length of an arbitrary $n$-chain $y$ and $b_q$ is the $b$-length of
the final $q$-chain of $y$. By virtue of our inductive hypothesis

(2.9) $m(b_q - 1) < q(b_m + 1)$.

Subtracting 1 from both members of (2.8), then multiplying by $m$ and using
(2.9), relation (2.4) results again as in Case I.
The proof of the theorem is complete.
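Relation (2.4) can be verified exhaustively on a small chain. The sketch below assumes the reading $n(b_m + 1) > m(b_n - 1)$ of (2.4), which is the form the two cases of the proof arrive at, and again encodes a chain by the list of its cell $b$-lengths. The helper names are hypothetical.

```python
from itertools import product
from typing import Sequence


def b_lengths_by_n(cells: Sequence[int]) -> dict[int, set[int]]:
    """Map each n to the set of b-lengths of the n-chains of the chain."""
    return {
        n: {sum(cells[i:i + n]) for i in range(len(cells) - n + 1)}
        for n in range(1, len(cells) + 1)
    }


def relation_2_4_holds(cells: Sequence[int]) -> bool:
    """Check n*(b_m + 1) > m*(b_n - 1) for all m-chains and n-chains."""
    table = b_lengths_by_n(cells)
    for m, n in product(table, repeat=2):
        for b_m in table[m]:
            for b_n in table[n]:
                if not n * (b_m + 1) > m * (b_n - 1):
                    return False
    return True


if __name__ == "__main__":
    print(relation_2_4_holds([2, 1, 2, 1, 1, 2, 1, 2, 1, 1]))  # a Sturmian chain
    print(relation_2_4_holds([0, 0, 0, 3]))                    # not Sturmian
```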


THEOREM 2.2. If $b_n$ is the $b$-length of an arbitrary $n$-chain of a
Sturmian beam or series, then $b_n/n$ tends to a finite limit $\alpha$ as $n$ becomes infinite.

It follows from (2.4) that

$$\left| \frac{b_m}{m} - \frac{b_n}{n} \right| < \frac{1}{m} + \frac{1}{n},$$

and the theorem follows directly.

We term $\alpha$ the frequency of the Sturmian beam or series. When $\alpha \ne 0$
we set $\beta = 1/\alpha$ and term $\beta$ the corresponding rotation number. When $\alpha = 0$,
$\beta$ shall be $\infty$ by convention.


When $X$ is a Sturmian chain we shall set

(2.10) $\alpha' = \max \left[ \dfrac{b_m - 1}{m} \right], \qquad \alpha'' = \min \left[ \dfrac{b_m}{m} + \dfrac{1}{m} \right]$,

$b_m$ ranging over all $b$-lengths of $m$-chains of $X$.


THEOREM 2.3. (a) A necessary and sufficient condition that a
cell-sequence $T$ be Sturmian is that there exist a constant $\alpha \ge 0$ such that the
$b$-lengths $b_n$ of $n$-chains of $T$ satisfy one of the two following sets of conditions
for each $n$:

(2.11)' $n\alpha - 1 < b_n \le n\alpha + 1$,

(2.11)'' $n\alpha - 1 \le b_n < n\alpha + 1$.

(b) If $T$ is a Sturmian chain, conditions (2.11) are satisfied if and only if
$\alpha$ is on the interval $\alpha' \le \alpha \le \alpha''$, the right and left equalities prevailing in
(2.11) at most when $\alpha = \alpha'$ and $\alpha = \alpha''$ respectively. (c) If $T$ is a Sturmian
beam or series, (2.11) is satisfied if and only if $\alpha$ is the frequency.

The conditions (2.11)' or the conditions (2.11)'' are sufficient that $T$
be Sturmian since there are at most two integral values of $b_n$ which satisfy
(2.11)' or (2.11)'' respectively for a given $n$, and these integral values differ
by at most one.
To prove the conditions (2.11) necessary we suppose $T$ Sturmian.
We begin with the case in which $T$ is a chain. It follows from (2.4) that

(2.12) $\dfrac{b_m - 1}{m} < \dfrac{b_n + 1}{n}$,

so that $\alpha' < \alpha''$. Moreover $\alpha' \ge 0$ except in the trivial case in which $b_m = 0$
for each $m$. It is easily seen that (2.11) holds for $\alpha' < \alpha < \alpha''$. For in
such a case

(2.13) $b_m \le m\alpha' + 1 < m\alpha + 1$.

Similarly

(2.14) $b_m \ge m\alpha'' - 1 > m\alpha - 1$.

For $\alpha' < \alpha < \alpha''$, (2.11) thus holds with the equalities excluded. It is also
clear from (2.13) and (2.14) that (2.11)' holds when $\alpha = \alpha'$ and (2.11)''
when $\alpha = \alpha''$ but that (2.11)' does not hold in general when $\alpha = \alpha''$ nor
(2.11)'' when $\alpha = \alpha'$.

The preceding analysis includes a proof of (b), as well as a proof of the
necessity of (2.11) when $T$ is a finite chain.




We come to the case in which $T$ is a Sturmian beam or series with
frequency $\alpha$. That $b_m \le m\alpha + 1$ follows at once from (2.12) upon letting $n$
become infinite. Similarly we see from (2.12) that $b_n \ge n\alpha - 1$ upon letting
$m$ become infinite. The conditions (2.11) taken as a whole are accordingly
satisfied by $T$. But it is impossible that $b_m = m\alpha + 1$ and $b_n = n\alpha - 1$ for
the same beam or cell-series $T$. For in such a case we find that

$$m(b_n + 1) = n(b_m - 1),$$

contrary to (2.4). Thus conditions (2.11) are necessary in one of the two
forms.

That (2.11) holds at most when $\alpha$ is the frequency follows upon dividing
the respective members of (2.11) by $n$ and letting $n$ become infinite.

The proof of the theorem is complete.
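For a finite chain the bounds of part (b) can be computed exactly. The sketch below is illustrative and rests on two assumed readings of the text: $\alpha' = \max\,(b_m - 1)/m$ and $\alpha'' = \min\,(b_m + 1)/m$ for (2.10), and $n\alpha - 1 < b_n < n\alpha + 1$ for (2.11) with the equalities excluded. Chains are encoded by their cell $b$-lengths, and the function names are hypothetical.

```python
from fractions import Fraction
from typing import Sequence


def extreme_b_lengths(cells: Sequence[int], m: int) -> tuple[int, int]:
    """Minimum and maximum b-length among the m-chains of the chain."""
    sums = [sum(cells[i:i + m]) for i in range(len(cells) - m + 1)]
    return min(sums), max(sums)


def alpha_bounds(cells: Sequence[int]) -> tuple[Fraction, Fraction]:
    """Return (alpha', alpha'') under the assumed reading of (2.10)."""
    sizes = range(1, len(cells) + 1)
    lower = max(Fraction(extreme_b_lengths(cells, m)[1] - 1, m) for m in sizes)
    upper = min(Fraction(extreme_b_lengths(cells, m)[0] + 1, m) for m in sizes)
    return lower, upper


def satisfies_2_11_strictly(cells: Sequence[int], alpha: Fraction) -> bool:
    """Check n*alpha - 1 < b_n < n*alpha + 1 for every n-chain."""
    for n in range(1, len(cells) + 1):
        low, high = extreme_b_lengths(cells, n)
        if not (n * alpha - 1 < low and high < n * alpha + 1):
            return False
    return True


if __name__ == "__main__":
    chain = [2, 1, 2, 1, 1, 2, 1, 2, 1, 1]
    print(alpha_bounds(chain))
    print(satisfies_2_11_strictly(chain, Fraction(7, 5)))
```

Exact rational arithmetic is used so that the boundary cases $\alpha = \alpha'$ and $\alpha = \alpha''$ are not blurred by rounding.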


3. The classification of Sturmian beams and series according to
frequencies. Let $R$ be a Sturmian beam with a rational frequency $\alpha$. When
$\alpha > 0$ we set $\alpha = q/p$ where $q$ and $p$ are relatively prime integers. When
$\alpha = 0$ we understand that $q = 0$ and $p = 1$.


Lemma 3.1. In a Sturmian beam R with a rational frequency q/p, 
there cannot exist two p-chains with b-lengths different from q, with different 
cell indices. 


The lemma is illustrated by the Sturmian series

(3.1) $\cdots\, a B_{-1}\, a B_0\, a B_1\, a\, \cdots$

in which the cell-series obtained by omitting $B_0 a$ is periodic with cell lengths
alternatingly 2 and 3, starting with $b(B_1) = 3$. The $b$-length of $B_0$ shall be 3.
Here $p = 2$, $q = 5$. The 2-chains in general have the $b$-length $q = 5$. But
the 2-chain $a B_0 a B_1 a$ has the $b$-length 6.
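The example can be checked on a finite window. The sketch below assumes the reading of the example in which $b(B_0) = 3$, $b(B_1) = 3$, and the cells other than $B_0$ alternate between 3 and 2 in both directions; cells are encoded by their $b$-lengths, and the function names are hypothetical.

```python
def cell_length(i: int) -> int:
    """b-length of the cell B_i in the example series, as read from the text."""
    if i == 0:
        return 3
    k = i if i > 0 else i + 1  # position of B_i once B_0 is omitted
    return 3 if k % 2 == 1 else 2


def satisfies_condition_c(cells: list[int]) -> bool:
    """True if the b-lengths of the n-chains differ by at most one for each n."""
    for n in range(1, len(cells) + 1):
        sums = [sum(cells[i:i + n]) for i in range(len(cells) - n + 1)]
        if max(sums) - min(sums) > 1:
            return False
    return True


def exceptional_two_chains(lo: int, hi: int) -> list[tuple[int, int]]:
    """2-chains [i, i+1] with lo <= i < hi whose b-length differs from q = 5."""
    found = []
    for i in range(lo, hi):
        length = cell_length(i) + cell_length(i + 1)
        if length != 5:
            found.append((i, length))
    return found


if __name__ == "__main__":
    window = [cell_length(i) for i in range(-10, 11)]
    print(satisfies_condition_c(window))
    print(exceptional_two_chains(-10, 10))
```

On this window the only 2-chain of $b$-length different from 5 is the chain $[0, 1]$, in agreement with the lemma.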

We come to the proof of the lemma. 

The $b$-lengths $b_p$ of $p$-chains of $R$ satisfy (2.11)' or (2.11)''. We
consider the case in which (2.11)' is satisfied. Then $b_p$ must be $q$ or $q + 1$.

We suppose the lemma false. There then exist two $p$-chains $x$ and $y$ of
$R$ with $b$-lengths $q + 1$ and with different cell indices. Without loss of
generality we can suppose that $x$ precedes $y$ in $R$. Let $w$ be the $m$-chain of $R$
with $x$ as its initial $p$-chain and $y$ its terminal $p$-chain. We distinguish two
cases: Case I. $m \ge 2p$; Case II. $p < m < 2p$.


Case I. Let $z$ be the subchain of $w$ whose cells are not cells of $x$ or $y$,
and suppose that $z$ is an $r$-chain. We understand that $r$ may be zero. Let $b_r$
and $b_m$ be the $b$-lengths of $z$ and $w$ respectively. We have $m = 2p + r$ and
$b_m = b_r + 2q + 2$. Upon applying (2.11)' to $b_r$ and to $b_m$ respectively, we
find that

(3.2) $r\,\dfrac{q}{p} - 1 < b_r \le r\,\dfrac{q}{p} + 1$,

(3.3) $(2p + r)\,\dfrac{q}{p} - 1 < b_r + 2q + 2 \le (2p + r)\,\dfrac{q}{p} + 1$.

Upon subtracting $2q$ from each member of (3.3) we see that (3.2) is
satisfied with $b_r$ replaced by $b_r + 2$. Since this is impossible we infer that Case I
is impossible.


Case II. Let $z$ be the subchain of $w$ whose cells occur in both $x$ and $y$,
and suppose that $z$ is an $r$-chain. Let $b_r$ and $b_m$ be the $b$-lengths of $z$ and $w$
respectively. We have $m = 2p - r$ and $b_m = 2q + 2 - b_r$. Upon applying
(2.11)' to $b_r$ and $b_m$ respectively, we obtain (3.2) and the relation

(3.3)' $(2p - r)\,\dfrac{q}{p} - 1 < 2q + 2 - b_r \le (2p - r)\,\dfrac{q}{p} + 1$.

Upon formally adding the respective members of (3.2) and (3.3)' and
subtracting $2q$ from each member we obtain the relation

$$-2 < 2 \le 2,$$

from which we infer that the equality holds in (3.2). Hence $r$ must be a
multiple of $p$. But this is impossible if $p < m < 2p$. Thus Case II is equally
impossible.

The case where (2.11)'' holds is similarly treated, and the proof is
complete.

A Sturmian beam or series will be said to have the cell-period $p$ if its
cells satisfy the relation

(3.4) $B_{i+p} = B_i$

for each admissible $i$.

THEOREM 3.1. A periodic Sturmian series $T$ or beam $R$ with rational
frequency $\alpha = q/p$, where² $(q,p) = 1$, has the minimum cell-period $p$. The
$b$-lengths $b_n$ of its $n$-chains satisfy the condition

(3.5) $n\alpha - 1 < b_n < n\alpha + 1$,

assuming each integral value $b_n$ which satisfies (3.5).

The $p$-chains of $T$ have the constant $b$-length $q$. Otherwise there would
be infinitely many $p$-chains with different cell-indices and with $b$-lengths
different from $q$ contrary to Lemma 3.1. Hence $T$ has the cell-period $p$.


² The notation $(q,p) = 1$ shall mean that $q$ and $p$ are relatively prime integers.




Let $s$ be an arbitrary cell-period of $T$ and let $r$ be the $b$-length of an
$s$-chain of $T$. Then

$$\frac{r}{s} = \frac{q}{p}.$$

This is possible only if $s$ is a multiple of $p$. Hence $p$ is the minimum
cell-period of $T$.

That (3.5) is satisfied will follow from (2.11) once the equality signs
are excluded from (2.11). But an equality can hold in (2.11) only if $n\alpha$
is an integer or zero, and this implies that $n$ is a multiple of $p$. When $n = rp$,
$b_n = rq$ since a $p$-chain of $T$ has the $b$-length $q$, and we conclude that $b_n = n\alpha$.
Hence the equality never prevails in (2.11) and (3.5) holds as stated.

To see that $b_n$ assumes each integral value which satisfies (3.5) we first
note that when $n$ is a multiple of $p$, $n\alpha$ is the only value of $b_n$ which satisfies
(3.5). There remains the case where $n$ is not a multiple of $p$. But if for
the given $n$, $b_n$ had but one value, $n$ would be a period of $T$, and hence a
multiple of $p$, contrary to hypothesis. Hence $b_n$ assumes each integral value
which satisfies (3.5).
A similar proof applies to beams.
The proof of the theorem is complete.
Sturmian series with irrational frequencies will be termed irrational.


THEOREM 3.2. The $b$-lengths $b_n$ of the $n$-chains of an irrational
Sturmian series with frequency $\alpha$ satisfy the condition

(3.6) $n\alpha - 1 < b_n < n\alpha + 1$,

assuming each integral value $b_n$ which satisfies (3.6).


The numbers $b_n$ satisfy (2.11)' or (2.11)'' as we have seen. But when
$\alpha$ is irrational the equality can never prevail in (2.11). Moreover, for each
$n$, $b_n$ assumes the two values defined by (3.6). Otherwise $T$ would have the
cell-period $n$, and hence a rational frequency.

The proof of the theorem is complete. 

Sturmian series which have rational frequencies but which are not periodic
will be termed skew. The appropriateness of the term will appear later. An
example of a skew Sturmian series has already been given in (3.1). Another
example $T^*$ will be given here. To define $T^*$ we first define a cell-series $Z$.
The beam of $Z$ whose first cell is $B_m$ shall have the cell-period 5. The $b$-lengths
of its cells shall have a period block 21211. The beam of $Z$ whose final cell
is $B_{m-1}$ shall also have the cell-period 5. The $b$-lengths of its cells shall have
a period block 11212. The $b$-lengths of cells of $Z$ thus form a sequence

$$\cdots (11212)(11212)(21211)(21211) \cdots.$$

To obtain $T^*$ from $Z$ we replace $B_m$ in $Z$ by a cell of $b$-length 1. The series
$T^*$ is seen to be Sturmian. Its frequency is 7/5.
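A finite window of $T^*$ can be built and tested directly. The sketch below is illustrative: it takes `k` copies of each period block on either side of the junction (the value of `k` and the function names are assumptions), encodes cells by their $b$-lengths, and locates the 5-chains whose $b$-length differs from 7.

```python
LEFT_BLOCK = [1, 1, 2, 1, 2]   # period block of the beam ending with B_{m-1}
RIGHT_BLOCK = [2, 1, 2, 1, 1]  # period block of the beam beginning with B_m


def z_window(k: int) -> list[int]:
    """Cell b-lengths of Z on a window of k blocks on each side of the junction."""
    return LEFT_BLOCK * k + RIGHT_BLOCK * k


def t_star_window(k: int) -> list[int]:
    """Same window for T*: the cell B_m (list position 5*k) gets b-length 1."""
    cells = z_window(k)
    cells[5 * k] = 1
    return cells


def satisfies_condition_c(cells: list[int]) -> bool:
    """True if the b-lengths of the n-chains differ by at most one for each n."""
    for n in range(1, len(cells) + 1):
        sums = [sum(cells[i:i + n]) for i in range(len(cells) - n + 1)]
        if max(sums) - min(sums) > 1:
            return False
    return True


def exceptional_five_chains(cells: list[int]) -> list[tuple[int, int]]:
    """(start position, b-length) of every 5-chain whose b-length is not 7."""
    sums = [(i, sum(cells[i:i + 5])) for i in range(len(cells) - 4)]
    return [(i, s) for i, s in sums if s != 7]


if __name__ == "__main__":
    k = 4
    print(satisfies_condition_c(z_window(k)))       # Z itself is not Sturmian
    print(satisfies_condition_c(t_star_window(k)))  # T* passes on the window
    print(exceptional_five_chains(t_star_window(k)))
```

On such a window the single exceptional 5-chain begins at the replaced cell and has $b$-length 6.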

To analyze skew Sturmian series we introduce several terms. An $n$-chain
of a Sturmian series $T$ whose $b$-length is the maximum or minimum among
$b$-lengths of $n$-chains of $T$ will be said to be of max or min type respectively.
A cell $B_r$ of $T$ will be said to be of max or min type if the chain $a B_r a$ is of
max or min type respectively.

It follows from Lemma 3.1 that in any skew Sturmian series $T$ with
frequency $q/p$ there exists one and only one $p$-chain whose $b$-length is different
from $q$. This chain will be called the critical chain of $T$.


THEOREM 3.3. Let $T$ be a skew Sturmian series with critical $p$-chain $C$.
The beam following (preceding) the initial (final) cell of $C$ has the
cell-period $p$. The initial cell $B$ of $C$ is identical with the final cell of $C$ while
the cells immediately preceding and following $C$ are identical and opposite in
type to $B$.

Let $X$ and $Y$ be respectively the beams preceding the final and following
the initial cell of $C$. The beams $X$ and $Y$ have the cell-period $p$ since each
of their $p$-chains has the $b$-length $q$.

Suppose for simplicity that $C$ is of max type. Let $B'$ be the terminal
cell of $C$. The beams

(3.7) $a B Y, \qquad X B' a$

are not periodic for their terminal $p$-chains $C$ have $b$-length $q + 1$. All the
$p$-chains in the two beams (3.7) will have the $b$-length $q$ provided their
terminal cells are reduced by a unit in $b$-length, following which reduction
both beams have the cell-period $p$. The cells thereby replacing $B$ and $B'$ have
copies in $T$ and must be of minimum type. Hence $B$ and $B'$ are of maximum
type and identical.

That the cell preceding (following) $C$ in $T$ is of type opposite to $B$
follows from the fact that the beams (3.7) are not periodic.

The case where $C$ is of minimum type is similarly treated.

THEOREM 3.4. The $b$-lengths $b_n$ of the $n$-chains of a skew Sturmian
series $T$ with frequency $\alpha$ satisfy one of the conditions (2.11). Condition
(2.11)' {(2.11)''} is satisfied if the critical chain is of max type {min type}.
The integers $b_n$ assume all integral values satisfying (2.11)' {(2.11)''}.

That the numbers $b_n$ satisfy one of the conditions (2.11)' or (2.11)''
follows from Theorem 2.3. Let us suppose that the conditions (2.11)' are
satisfied. Since $T$ is not periodic, $b_n$ must assume two distinct values for each
positive integer $n$. Corresponding to a given integer $n$ there are only two
integral values which satisfy (2.11)' and it follows that $b_n$ must assume all
integral values which satisfy (2.11)'. If $\alpha = q/p$, where $(q,p) = 1$, then
$b_p = q$ or $q + 1$. The critical $p$-chain of $T$ must have $b$-length $q + 1$ and
hence is of max type.

Analogous arguments hold if the integers $b_n$ satisfy (2.11)'' and the
proof of the theorem is complete.


CONDITION A. A set of $m$-chains will be said to satisfy Condition A if,
$n$ being any positive integer not exceeding $m$, the $b$-lengths of the sub $n$-chains
of the given set of $m$-chains assume at most two values.


LEMMA 3.2. If a set of $m$-chains satisfies Condition A, the number of
chains in the set cannot exceed $m + 1$.


If $X$ and $Y$ are $m$- and $n$-chains respectively, $XY$ shall mean the
$(m + n)$-chain whose first $m$-chain and last $n$-chain are $X$ and $Y$ respectively.

The lemma is obviously true if $m = 1$. We assume the lemma true for
integers not exceeding $m - 1$ and prove that the lemma holds for the integer
$m > 1$. If the lemma were not true there would exist a set of $m + 2$ different
$m$-chains satisfying Condition A. Let us denote such a set by

$$C_1, C_2, \cdots, C_{m+2}.$$

By the hypothesis of the induction, this set contains at most two different
1-chains $B$ and $B^*$ and therefore the set can be written in the form

(C) $C_i = D_i B_i \qquad (i = 1, 2, \cdots, m+2)$,

where $B_i$, $i = 1, 2, \cdots, m+2$, is either $B$ or $B^*$ and the chains of the set

(D) $D_1, D_2, \cdots, D_{m+2}$


are $(m-1)$-chains. By the hypothesis of the induction, there can be at most
$m$ different $(m-1)$-chains in the set (D). Since the members of the set (C)
are assumed to be all different, there cannot be three identical $(m-1)$-chains
in the set (D). It follows that there are at least two pairs in the set (D) such
that members of the same pair are identical. We can assume the notation so
chosen that

$$D_1 = D_2, \qquad D_3 = D_4,$$

$$C_1 = D_1 B, \quad C_2 = D_1 B^*, \quad C_3 = D_3 B, \quad C_4 = D_3 B^*.$$

Since the members of the set (C) are all different, it follows that $D_1$ and $D_3$
must differ in some cell and can be written in the form

$$D_1 = E_1 B_1 F, \qquad D_3 = E_3 B_3 F,$$

where $E_1$, $E_3$, $F$ are chains, while one of the pair $B_1$, $B_3$ is $B$ and the other
is $B^*$. The chains

$$B_1 F B, \quad B_1 F B^*, \quad B_3 F B, \quad B_3 F B^*$$

are subchains of the given set of $m$-chains. These four chains contain the
same number of cells and their $b$-lengths assume three different values. From
this contradiction we infer the truth of the lemma.


LEMMA 3.3. Corresponding to a given constant $\alpha \ge 0$ the set of
Sturmian chains satisfying (2.11)' {(2.11)''} contains at most $n + 1$ different
$n$-chains.


For the set of Sturmian $n$-chains satisfying (2.11)' {(2.11)''} is a set
satisfying Condition A and it follows from Lemma 3.2 that there can be at
most $n + 1$ different $n$-chains in the set.
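The bound of Lemma 3.3 can be observed by brute force for small $n$. The sketch below is an illustration under an assumed reading of (2.11)', namely $k\alpha - 1 < b_k \le k\alpha + 1$ for every sub $k$-chain, which is the reading under which $b_p$ is $q$ or $q+1$ in the proof of Lemma 3.1. Chains are encoded by their cell $b$-lengths; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import floor
from typing import Sequence


def satisfies_2_11_prime(cells: Sequence[int], alpha: Fraction) -> bool:
    """Every sub k-chain has b-length b with k*alpha - 1 < b <= k*alpha + 1."""
    for k in range(1, len(cells) + 1):
        for i in range(len(cells) - k + 1):
            b = sum(cells[i:i + k])
            if not (k * alpha - 1 < b <= k * alpha + 1):
                return False
    return True


def admissible_chains(n: int, alpha: Fraction) -> list[tuple[int, ...]]:
    """All n-chains, as tuples of cell b-lengths, that satisfy (2.11)'."""
    max_cell = floor(alpha + 1)  # a single cell already obeys b <= alpha + 1
    candidates = product(range(max_cell + 1), repeat=n)
    return [c for c in candidates if satisfies_2_11_prime(c, alpha)]


if __name__ == "__main__":
    alpha = Fraction(7, 5)
    for n in range(1, 7):
        print(n, len(admissible_chains(n, alpha)))
```

The enumeration is exponential in $n$ and is meant only as a check for small cases.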

Let $T$ be a Sturmian series and $r$ an integer, positive, negative or zero.
The Sturmian series $T'$ which results upon adding $r$ to the index of each cell
of $T$ will be said to be similar to $T$. We write $T' \sim T$.


THEOREM 3.5. A periodic Sturmian series $T$ {beam $R$} with frequency
$\alpha = q/p$, where $(q,p) = 1$, contains $n + 1$ different $n$-chains if $0 < n < p$
and $p$ different $n$-chains if $n \ge p$. Two periodic Sturmian series with the
same frequency are similar.


The series $T$ has the minimum cell-period $p$ (cf. Theorem 3.1). It
follows that if $n < p$, $T$ must contain at least $n + 1$ different $n$-chains, for
otherwise (cf. SD § 7; the arguments given in SD concern blocks, but similar
arguments apply to chains) $T$ would have a cell-period less than $p$. The
$b$-lengths of the $n$-chains of $T$ satisfy one of the conditions (2.11) and we
infer from Lemma 3.3 that $T$ contains at most $n + 1$ different $n$-chains. Thus
$T$ contains $n + 1$ different $n$-chains if $0 < n < p$. Since the number of
different $n$-chains of $T$ is a non-decreasing function of $n$, $T$ contains at least
$p$ different $n$-chains if $n \ge p$. The periodicity of $T$ implies that $T$ contains
at most $p$ different $n$-chains.

Let $T$ and $T'$ be periodic Sturmian series with the same frequency
$\alpha = q/p$, $(q,p) = 1$. The $b$-lengths of the $n$-chains of $T$ and $T'$ satisfy
(3.5) and hence (2.11)'. It follows from Lemma 3.3 that the totality of
$(p-1)$-chains in $T$ and $T'$ form a set containing at most $p$ different
$(p-1)$-chains. But it has been shown that each of the cell-series $T$ and $T'$ contains $p$
different $(p-1)$-chains and we infer that $T$ and $T'$ contain the same
$(p-1)$-chains. In particular, $T$ and $T'$ contain identical $(p-1)$-chains and since
the $b$-length of any $p$-chain of $T$ or $T'$ is $q$, it follows that $T$ and $T'$ contain
identical $p$-chains. Since $T$ and $T'$ are periodic with cell-period $p$, they are
similar.
The proof of the theorem is complete.
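The count of different $n$-chains in Theorem 3.5 is easy to observe on an example. The sketch below is illustrative: it represents a periodic series by one period block of cell $b$-lengths (here the block 2, 1, 2, 1, 1 of frequency 7/5, chosen as an example) and collects the distinct runs of $n$ consecutive cells read cyclically. The function name is hypothetical.

```python
from typing import Sequence


def distinct_n_chains(period_block: Sequence[int], n: int) -> set[tuple[int, ...]]:
    """Distinct n-chains of the periodic series generated by `period_block`.

    Each n-chain is recorded as the tuple of b-lengths of its n cells;
    indices are taken modulo the period, so n may exceed the period.
    """
    p = len(period_block)
    return {
        tuple(period_block[(start + offset) % p] for offset in range(n))
        for start in range(p)
    }


if __name__ == "__main__":
    block = [2, 1, 2, 1, 1]  # frequency 7/5, so p = 5
    for n in range(1, 8):
        print(n, len(distinct_n_chains(block, n)))
```

For this block the counts are $n + 1$ for $n < 5$ and 5 for $n \ge 5$.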


THEOREM 3.6. A skew Sturmian series U contains n+ 1 different 
n-chains for every positive integer n. Two skew Sturmian series with the 
same frequency and with critical chains of the same b-length are similar. 


Let n be a positive integer. Since U is not periodic it must contain at 
least n + 1 different n-chains and it follows from Theorem 3.4 and Lemma 
3.3 that U contains exactly n + 1 different n-chains. 

Let $U$ and $V$ be skew Sturmian series with the same frequency $\alpha = q/p$,
where $(q,p) = 1$, and with critical $p$-chains of the same $b$-length. The critical
$p$-chains of $U$ and $V$ are either both of $b$-length $q + 1$ or both of $b$-length
$q - 1$. In the former case $U$ and $V$ satisfy (2.11)' and in the latter (2.11)''.
It follows from Lemma 3.3 that $U$ and $V$ contain the same $n$-chains and in
particular the same critical $p$-chains. By virtue of the relation of a skew
Sturmian trajectory to its critical chain, as given in Theorem 3.3, we infer
that $U$ and $V$ are similar.

Let $X$ be a cell-series and let $Y$ be the cell-series obtained from $X$ by
inverting the order of the indices of the cells of $X$. We term $Y$ the inverse
of $X$. If $X$ is a Sturmian series, the inverse $Y$ of $X$ satisfies Condition C and
hence is Sturmian. If the inverse $Y$ of a Sturmian series $X$ is similar to $X$,
we term $X$ symmetric.

COROLLARY. A skew Sturmian series is symmetric.


For if $X$ is a skew Sturmian series, its inverse $Y$ is evidently skew
Sturmian with frequency equal to that of $X$ and with critical chain of $b$-length
equal to that of the critical chain of $X$. It follows from Theorem 3.6 that
$Y$ is similar to $X$ and hence $X$ is symmetric.


THEOREM 3.7. Two irrational Sturmian trajectories with the same
frequency contain the same $n + 1$ different $n$-chains for each positive integer $n$.


Let $T$ and $T'$ be irrational Sturmian trajectories with the same frequency
$\alpha$. Since neither $T$ nor $T'$ is periodic, each must contain at least $n + 1$
different $n$-chains for each positive integer $n$. Since $T$ and $T'$ have the same
frequency $\alpha$ the $b$-lengths of the $n$-chains of both $T$ and $T'$ satisfy (3.6) and
hence (2.11)'. We infer from Lemma 3.3 that the totality of $n$-chains in
both $T$ and $T'$ cannot consist of more than $n + 1$ different $n$-chains. It
follows that $T$ and $T'$ contain the same $n + 1$ different $n$-chains. The proof of
the theorem is complete.




4. Mechanical sequences. Let $\alpha$ be a positive real number and $c$ an
arbitrary real number. On the real axis $-\infty < x < +\infty$ we introduce the
set of points

(4.0) $\cdots,\; c - 2\beta,\; c - \beta,\; c,\; c + \beta,\; c + 2\beta,\; \cdots \qquad (\beta = 1/\alpha)$.

We term $c$ the pole of this set of points.

Let $T(c,\alpha)$ {$T'(c,\alpha)$} denote the cell-series of the form (2.1) in which
the $i$-th cell $B_i$ contains as many $b$'s as there are points (4.0) in the interval
$i \le x < i+1$ {$i < x \le i+1$}. The $b$-length of an $n$-chain of $T(c,\alpha)$ or
$T'(c,\alpha)$ is either $s_n$ or $s_n + 1$, where

(4.1) $n = s_n \beta + r_n \qquad (0 \le r_n < \beta)$.


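It follows that $T(c,\alpha)$ and $T'(c,\alpha)$ are Sturmian series. Observe that

$$\alpha = \frac{1}{\beta} = \frac{s_n}{n} + \frac{r_n}{n\beta} = \lim_{n \to \infty} \frac{s_n}{n},$$

so that $\alpha$ is the frequency of both $T(c,\alpha)$ and $T'(c,\alpha)$.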
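The construction lends itself to direct computation. The sketch below is a minimal illustration, assuming the cell $B_i$ of $T(c,\alpha)$ counts the points $c + k\beta$ with $i \le x < i+1$ and the cell of $T'(c,\alpha)$ those with $i < x \le i+1$; the counts are then differences of ceilings and of floors respectively. Exact rationals are used for $\alpha$ and $c$, and the function names are hypothetical.

```python
from fractions import Fraction
from math import ceil, floor


def cell_T(i: int, c: Fraction, alpha: Fraction) -> int:
    """Number of points c + k*beta (beta = 1/alpha) with i <= x < i + 1."""
    return ceil((i + 1 - c) * alpha) - ceil((i - c) * alpha)


def cell_T_prime(i: int, c: Fraction, alpha: Fraction) -> int:
    """Number of points c + k*beta with i < x <= i + 1."""
    return floor((i + 1 - c) * alpha) - floor((i - c) * alpha)


def brute_force_cell(i: int, c: Fraction, alpha: Fraction, span: int = 200) -> int:
    """Direct count of the points in i <= x < i + 1, for small |i| only."""
    beta = 1 / alpha
    return sum(1 for k in range(-span, span + 1) if i <= c + k * beta < i + 1)


def chain_b_length(r: int, n: int, c: Fraction, alpha: Fraction) -> int:
    """b-length of the n-chain of T(c, alpha) whose first cell is B_r."""
    return sum(cell_T(i, c, alpha) for i in range(r, r + n))


if __name__ == "__main__":
    alpha, c = Fraction(7, 5), Fraction(1, 3)
    print([cell_T(i, c, alpha) for i in range(0, 10)])
    print([cell_T_prime(i, c, alpha) for i in range(0, 10)])
```

With $s_n$ the integer part of $n\alpha$, every $n$-chain produced this way has $b$-length $s_n$ or $s_n + 1$, as stated in connection with (4.1).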

When $\alpha = 0$ we understand that (4.0) is a null set of points. The
corresponding cell-series contains no $b$'s and will be denoted by either $T(c,0)$ or
$T'(c,0)$.

If $c_1 \equiv c_2 \bmod \beta$ the corresponding sets (4.0) are identical and
$T(c_1,\alpha) = T(c_2,\alpha)$, $T'(c_1,\alpha) = T'(c_2,\alpha)$. The set of points congruent
to $x \bmod \beta$ will be denoted by $P(x)$. The domain of $P(x)$ will be regarded as a
circle $\Gamma$. The function $P(x)$ maps the $x$-axis onto the circle $\Gamma$. In this map
the image on $\Gamma$ of a neighborhood of a point $c$ on the $x$-axis will be regarded as
a neighborhood of $P(c)$ on $\Gamma$. The circle $\Gamma$ will be taken in the sense which
corresponds locally to the sense of increasing $c$. The interval $PQ$ on $\Gamma$, $P \ne Q$,
shall mean the segment of $\Gamma$ which begins with $P$ and ends with $Q$, taking $\Gamma$
in its positive sense, and including $P$ but not $Q$. When $P = Q$, the interval
$PQ$ shall be the whole of $\Gamma$. We term $\Gamma$ the $\beta$-circle.


LEMMA 4.1. If $P(r) \ne P(n+1)$ and $\alpha > 0$, an $m$-chain $[r,n]$ of
$T(c,\alpha)$ is of max or min type respectively according as $P(c)$ is on the
interval $P(r)P(n+1)$ or the complementary interval $P(n+1)P(r)$ of the
$\beta$-circle. If $P(r) = P(n+1)$ all $m$-chains have the same $b$-length.


This follows at once from the conventions upon noting that the type
of an $m$-chain $[r,n]$ of $T(c,\alpha)$ decreases when $P(c)$ leaves the interval
$P(r)P(n+1)$ and remains invariant as $P(c)$ varies on this interval, or its
complement. In particular, suppose $s$ is an integer between $r$ and $n+1$
inclusive. Suppose $P(r) \ne P(n+1)$. Then $P(s) \ne P(s+1)$ whatever
the integer $s$. As $P(c)$ enters an interval beginning with $P(s)$, $P(c)$
varying in the positive sense on $\Gamma$, the cells $B_{s-1}$ and $B_s$ change their types to min
and max, respectively.

We define the alternate interval $PQ$, $P \ne Q$, of $\Gamma$ as the segment of $\Gamma$
which begins with $P$ and ends with $Q$, taking $\Gamma$ in its positive sense, and
including $Q$ but not $P$. When $P = Q$, the alternate interval $PQ$ shall be
the whole of $\Gamma$. The proof of the following lemma is analogous to that of
Lemma 4.1.


LEMMA 4.1'. Read Lemma 4.1 with $T(c,\alpha)$ replaced by $T'(c,\alpha)$ and
with the term interval replaced by alternate interval.


THEOREM 4.1. If $\alpha$ is irrational, two series $T(c,\alpha)$ and $T(a,\alpha)$
{$T'(c,\alpha)$ and $T'(a,\alpha)$} are identical if and only if $c \equiv a \bmod \beta$.

If $\alpha$ is irrational, the points $P(n)$, $(n = 1, 2, \cdots)$, are everywhere
dense on the $\beta$-circle $\Gamma$. If $c \not\equiv a \bmod \beta$, $P(c) \ne P(a)$. There accordingly
exist integers $r$ and $n$ with $r < n$ such that $P(c)$ lies on the interval
$P(r)P(n+1)$ of $\Gamma$ while $P(a)$ lies on the complementary interval. It
follows from Lemma 4.1 that the chains $[r,n]$ of $T(c,\alpha)$ and $T(a,\alpha)$ are
different. If $a \equiv c \bmod \beta$, $T(a,\alpha) = T(c,\alpha)$. The proof of the theorem
is complete for the Sturmian series $T(c,\alpha)$ and $T(a,\alpha)$.

The proof of the theorem for the series $T'(c,\alpha)$ and $T'(a,\alpha)$ is similar.

The residue intervals. Suppose $\alpha = q/p$ with $q > 0$, $p > 0$, and
$(q,p) = 1$. Let $m$ be an arbitrary integer. Then

$$qm = sp + r, \qquad 0 \le r < p,$$

where $s$ and $r$ are integers. Hence

$$m = s\beta + \frac{r}{q}.$$

It follows that the numbers $r/q$ with $r = 0, 1, \cdots, p-1$, form a complete
set of residues mod $\beta$ of the rational integers, and the point set $P(n)$ on the
$\beta$-circle $\Gamma$ reduces to the set of $p$ points $P(r/q)$. The latter set is identical
with the set of points

(4.2) $P(0), P(1), \cdots, P(p-1)$

since no two integers for which $0 \le n \le p-1$ are congruent mod $\beta$.

The points (4.2) divide the $\beta$-circle $\Gamma$ into $p$ successive intervals termed
residue intervals if the initial but not the terminal point is included in an
interval, and the alternate residue intervals if the terminal but not the initial
point is included in an interval.
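The residues of the integers mod $\beta$ can be listed with exact rational arithmetic. The sketch below is illustrative and assumes $\alpha = q/p$ in lowest terms with $\beta = p/q$; each point $P(n)$ is represented by the least non-negative residue of $n$ mod $\beta$. The function names and the size of the sampled range of integers are assumptions.

```python
from fractions import Fraction


def residue_points(q: int, p: int, span: int = 3) -> list[Fraction]:
    """Sorted residues mod beta = p/q of the integers in [-span*p, span*p]."""
    beta = Fraction(p, q)
    return sorted({Fraction(n) % beta for n in range(-span * p, span * p + 1)})


def first_p_residues(q: int, p: int) -> list[Fraction]:
    """Residues mod beta of the integers 0, 1, ..., p - 1, in that order."""
    beta = Fraction(p, q)
    return [Fraction(n) % beta for n in range(p)]


if __name__ == "__main__":
    print(residue_points(7, 5))    # the p end points of the residue intervals
    print(first_p_residues(7, 5))  # the same points, indexed by n = 0, ..., 4
```

For $\alpha = 7/5$ the residues are the five numbers $r/7$ with $r = 0, \cdots, 4$, which cut the $\beta$-circle into five residue intervals of equal length $1/7$.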

The following theorem is an easy consequence of Lemmas 4. 1 and 4. 1’. 




THEOREM 4.2. When $\alpha$ is positive and rational, two cell-series $T(c,\alpha)$
{$T'(c,\alpha)$} are identical if and only if the corresponding points $P(c)$ lie on
the same residue interval {alternate residue interval}.


THEOREM 4.3. When $\alpha$ is irrational, $T(a,\alpha) = T'(c,\alpha)$ if and only if
$a \equiv c \bmod \beta$ and $c \not\equiv m \bmod \beta$ where $m$ is an integer. If $a \equiv c \equiv m \bmod \beta$,
$m$ an integer, the corresponding cells of $T(a,\alpha)$ and $T'(c,\alpha)$ are equal except
that $B_m$ and $B_{m-1}$ are of max and min type respectively in $T(a,\alpha)$ and of
opposite types in $T'(c,\alpha)$.


If $c \not\equiv m \bmod \beta$, where $m$ is an integer, the number of points of the set
(4.0) in the interval $i \le x < i+1$ is identical with the number in the
interval $i < x \le i+1$. Thus the cell $B_i$ of $T(c,\alpha)$ is identical with the
cell $B_i$ of $T'(c,\alpha)$ and $T(c,\alpha) = T'(c,\alpha)$. If $a \equiv c \bmod \beta$ it follows from
Theorem 4.1 that $T(a,\alpha) = T(c,\alpha)$ and hence $T(a,\alpha) = T'(c,\alpha)$.

Conversely, let us assume that $T(a,\alpha) = T'(c,\alpha)$. Since $\alpha$ is irrational
the points $P(n)$, $n = 1, 2, \cdots$, are everywhere dense on the $\beta$-circle $\Gamma$ and
by arguments similar to those given in the proof of Theorem 4.1, it is easily
shown that $a \equiv c \bmod \beta$. If $c \equiv m \bmod \beta$ the interval $m-1 < x \le m$
contains one more point of the set (4.0) than does the interval
$m-1 \le x < m$, namely the point $m$. It follows that the cell $B_{m-1}$ is of min
type in $T(a,\alpha)$ and of max type in $T'(c,\alpha)$. Similarly, the cell $B_m$ is of
max type in $T(a,\alpha)$ and of min type in $T'(c,\alpha)$. Thus if $T(a,\alpha) = T'(c,\alpha)$
we must have $a \equiv c \not\equiv m \bmod \beta$.

The second statement of the theorem follows readily. 
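The exceptional case of Theorem 4.3 can be checked numerically. The sketch below is an illustration and not part of the paper. It assumes that the set (4.0), defined earlier in the paper, consists of the points c + kβ (k any integer, β = 1/α), that the cell B_i of T(c, α) counts those points with i ≤ x < i + 1, and that T′(c, α) uses i < x ≤ i + 1 instead. The function names are hypothetical.

```python
import math

def cell_T(i, c, alpha):
    """b-length of cell B_i of T(c, alpha): points c + k/alpha with i <= x < i + 1."""
    return math.ceil((i + 1 - c) * alpha) - math.ceil((i - c) * alpha)

def cell_T_prime(i, c, alpha):
    """b-length of cell B_i of T'(c, alpha): points c + k/alpha with i < x <= i + 1."""
    return math.floor((i + 1 - c) * alpha) - math.floor((i - c) * alpha)

alpha = (math.sqrt(5) - 1) / 2   # an irrational frequency, chosen only for illustration
m = 3                            # c = m is an integer: the exceptional case

# Indices at which the two cell-series disagree.
differing = [i for i in range(-20, 21)
             if cell_T(i, m, alpha) != cell_T_prime(i, m, alpha)]
print(differing)                                          # [2, 3], i.e. m - 1 and m
print(cell_T(m, m, alpha), cell_T_prime(m, m, alpha))     # 1 0
```

Under these assumptions the two series differ only in the cells with indices m − 1 and m, where the max and min types are exchanged.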

The following theorem is easily derived with the aid of Theorem 4. 2. 


THEOREM 4.4. When α is positive and rational, the cell-series T(a, α)
and T′(c, α) are identical if and only if the residue interval in which P(a)
lies coincides, except for end points, with the alternate residue interval in
which P(c) lies.


THEOREM 4.5. If α is irrational, T(c, α) ~ T(a, α) {T′(c, α) ~ T′(a, α)}
if and only if c = a + pβ + q, where p and q are integers.


If c = a + pβ + q it follows from Theorem 4.1 that T(c, α)
= T(a + q, α). But the chain [r, s] of T(a, α) is identical with the chain
[r + q, s + q] of T(a + q, α), independently of the values of r and s, and
hence T(a, α) ~ T(a + q, α) = T(c, α).

To prove the converse we assume that T(c, α) ~ T(a, α). It follows
that there exists an integer q such that the cell $B_i$ of T(c, α) is identical
with the cell $B_{i-q}$ of T(a, α) for all integral values of i. The cell $B_i$ of
T(a + q, α) is identical with the cell $B_i$ of T(c, α). Thus T(c, α)
= T(a + q, α) and we infer from Theorem 4.1 that c ≡ a + q mod β, or
c = a + pβ + q, where p and q are integers.

A similar proof applies to the pair T′(c, α) and T′(a, α). The proof of
the theorem is complete.


THEOREM 4.6. If α is rational the cell-series T(c, α) and T′(c, α) are
periodic and any two of these cell-series are similar, whatever the value of c.


The periodicity of these cell-series is evident. They are all Sturmian
series with the same rational frequency α. It follows from Theorem 3.5 that
any two of these cell-series are similar. The proof of the theorem is complete.

If α is irrational and c ≢ m mod β, where m is an integer, the cell-series
T(c, α) and T′(c, α) are identical. Thus the class of cell-series T(c, α)
corresponding to a given value of α includes most of the cell-series T′(c, α).
However, as stated in the following theorem, there are exceptions.


THEOREM 4.7. If α is irrational and c ≡ m mod β, where m is an
integer, the cell-series T′(c, α) is not similar to any cell-series T(a, α).


Let us suppose that T′(c, α) and T(a, α) are similar. It follows that
there exists an integer q such that T(a + q, α) = T′(c, α). We infer from
Theorem 4.3 that c ≢ m mod β, where m is an integer. From this contradiction
we infer the truth of the theorem.

The cell-series S(m, α) and S′(m, α). The preceding cell-series T(c, α)
and T′(c, α) include no skew Sturmian series. To obtain such cell-series we
introduce new mechanical sequences as follows. Let c be a rational integer m
and α positive and rational. In S(m, α) the number of b's in $B_n$ shall equal
the number of points of the set (4.0) on the intervals

$n < x \le n+1, \qquad n < x < n+1, \qquad n \le x < n+1,$

according as n < m, n = m, or n > m. In S′(m, α) the number of b's in $B_n$
shall equal the number of points of the set (4.0) on the intervals

$n \le x < n+1, \qquad n \le x \le n+1, \qquad n < x \le n+1,$

according as n < m, n = m, or n > m. As a special convention we understand
that S′(m, 0) shall consist of null cells except that $B_m$ shall be b.
S(m, 0) will not be defined.


THEOREM 4.8. The cell-sequences S(m, α) and S′(m, α) are skew
Sturmian with a critical chain of min and max type respectively. The cell
$B_m$ is the initial cell of the critical chain of S(m, α) {S′(m, α)}.




In case α = 0, all the cells of S′(m, α) are null save $B_m$ and the theorem
is obvious. We accordingly assume that α > 0.

To establish the theorem it is sufficient to show that S {S′} is Sturmian.
To that end let b be the b-length of an n-chain of S {S′} whose initial cell is
$B_r$. Then b is the number of points of the set (4.0) on an interval I(S)
{I(S′)} of length n beginning at the point x = r. Various cases arise according
to the nature of n and according as I includes one, both, or neither of its end
points. We represent n in the form

$n = s_n \beta + \tau_n, \qquad 0 \le \tau_n < \beta,$

and distinguish between the two following cases.


Case I. $\tau_n \ne 0$. Here I contains at most one of the points (4.0) as an
end point and b = $s_n$ or $s_n + 1$. The comparison Condition C is accordingly
satisfied.


Case II. $\tau_n = 0$. When r ≢ m mod β, no point of (4.0) is at an
end point of I and b = $s_n$. When r ≡ m mod β, there are points of (4.0) at
both end points of I(S) {I(S′)}. We see that b = $s_n$ or $s_n - 1$ in S and $s_n$ or
$s_n + 1$ in S′. The Condition C is accordingly satisfied.

Thus S and S′ are Sturmian. They have the frequency α. They are not
periodic by virtue of the definition of $B_m$. Moreover S and S′ are skew
Sturmian with critical p-chain [m, m + p − 1]. For the b-lengths of this
p-chain in S(m, α) or S′(m, α), respectively, are the number of points of the
set (4.0) on the intervals

$m < x < m + q\beta, \qquad m \le x \le m + q\beta, \qquad (p = q\beta),$

and in either case are different from q, the b-length of every other p-chain.
Hence $B_m$ is the initial cell of the critical chain of S or S′.
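The critical chain can be exhibited on a small rational example. The following sketch is illustrative only and rests on stated assumptions: the set (4.0) is taken to be the points m + kβ (k any integer, β = p/q), and the cells of S(m, α) are assumed to count points on n < x ≤ n + 1, n < x < n + 1, n ≤ x < n + 1 according as n < m, n = m, n > m, with the open and closed ends exchanged for S′(m, α). All function names are hypothetical, and exact fractions are used so that end points are handled without rounding.

```python
from fractions import Fraction
from math import ceil, floor

def count_points(m, alpha, lo, hi, lo_closed, hi_closed):
    """Number of points m + k/alpha (k integer) between lo and hi, with chosen end points."""
    lower = (lo - m) * alpha          # x >= lo  is equivalent to  k >= lower
    upper = (hi - m) * alpha          # x <= hi  is equivalent to  k <= upper
    k_min = ceil(lower) if lo_closed else floor(lower) + 1
    k_max = floor(upper) if hi_closed else ceil(upper) - 1
    return max(0, k_max - k_min + 1)

def cell_S(n, m, alpha):
    """b-length of cell B_n under the assumed convention for S(m, alpha)."""
    if n < m:
        return count_points(m, alpha, n, n + 1, False, True)
    if n == m:
        return count_points(m, alpha, n, n + 1, False, False)
    return count_points(m, alpha, n, n + 1, True, False)

def cell_S_prime(n, m, alpha):
    """b-length of cell B_n under the assumed convention for S'(m, alpha)."""
    if n < m:
        return count_points(m, alpha, n, n + 1, True, False)
    if n == m:
        return count_points(m, alpha, n, n + 1, True, True)
    return count_points(m, alpha, n, n + 1, False, True)

def chain_b_length(cell, r, length, m, alpha):
    """b-length of the chain of the given length whose initial cell is B_r."""
    return sum(cell(n, m, alpha) for n in range(r, r + length))

q, p, m = 2, 3, 0
alpha = Fraction(q, p)
for r in range(-6, 7):
    print(r,
          chain_b_length(cell_S, r, p, m, alpha),
          chain_b_length(cell_S_prime, r, p, m, alpha))
```

With α = 2/3 and m = 0, every 3-chain has b-length 2 except the one beginning at B_0, which has b-length 1 in S and 3 in S′.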


5. On the representation of Sturmian series by mechanical sequences. 
The mechanical sequences are Sturmian. We show conversely that any Stur- 
mian series is identical with a properly chosen mechanical sequence. 


THEOREM 5.1. A periodic Sturmian series U with frequency α is identical
with T(c, α) for suitable choice of c.


If α = 0 the only symbol appearing in U and T(c, 0) is a and the theorem
is evident.

We assume α = q/p > 0, where (q, p) = 1. According to Theorem
4.6, T(c, α) is a periodic Sturmian series. Since U and T(c, α) are periodic
Sturmian series with the same frequency, we infer from Theorem 3.5 that U
and T(c, α) are similar. The series U has the cell-period p and thus there
are at most p different Sturmian series which are similar to U and such that
no two are identical. According to Theorem 4.2, the cell-series T(c, α) and
T(a, α) are identical only if the points P(c) and P(a) lie on the same residue
interval. Since there are p residue intervals corresponding to α = q/p
it follows that there are p cell-series T(c, α), no two of which are identical.
Since all of these cell-series are similar to U, we infer that one of them is
identical with U. The proof of the theorem is complete.


THEOREM 5.2. A skew Sturmian series U with frequency α is identical
with one of the cell-series S(n, α) or S′(n, α) for suitable choice of the
integer n.

In the case α = 0 the cell-series U is evidently identical with S′(m, 0).

We assume α = q/p > 0, where (q, p) = 1. Let U be a skew Sturmian
series with frequency α, with critical chain of min type and with $B_m$ as the
initial cell of its critical chain. Then U and S(m, α) are skew Sturmian
series with the same frequency and with critical chains of the same b-lengths.
It follows from Theorem 3.6 that U and S(m, α) are similar. According to
Theorem 4.8, $B_m$ is the initial cell of the critical chain of S(m, α). It follows
from Theorem 3.3 that S(m, α) and U are identical.

Similar arguments show that if the critical chain of U is of max type, U
is identical with S′(m, α) for suitable choice of m.


LEMMA 5.1. An m-chain B whose n-chains satisfy the condition

(5.1)    $n\alpha - 1 < b_n < n\alpha + 1$

for some α ≥ 0 has a copy in the cell-series T(c, α) for each value of c.


The case α = 0 is trivial since $b_n = 0$ for each n when α = 0.

If α is irrational the cell series T(c, α) contains m + 1 different m-chains
(cf. Theorem 3.7) whose n-chains satisfy (5.1) (cf. Theorem 3.2). If the
m-chains of a set of m-chains satisfy (5.1), they satisfy (2.11)′ and are
Sturmian. It follows from Lemma 3.3 that the number of different m-chains
in such a set cannot exceed m + 1. Thus T(c, α) contains all m-chains whose
n-chains satisfy (5.1) and in particular T(c, α) contains B.

If α = q/p ≠ 0, where (q, p) = 1, arguments similar to those of the
irrational case apply if m < p. If m ≥ p, it follows from (5.1) that all
p-chains of B have b-length q; B is periodic with cell-period p and is completely
determined by its initial (p − 1)-chain B* and the integer q. Since T(c, q/p)
contains all the p possible (p − 1)-chains whose n-chains satisfy (5.1) (cf.
Theorem 3.5 and proof) it contains B*. But T(c, q/p) is periodic with
cell-period p and each of its p-chains is of b-length q. It follows that T(c, q/p)
contains B.

The proof of the lemma is complete.


THEOREM 5.3. A Sturmian series U with irrational frequency α is
identical with T(c, α) or T′(c, α) for at least one value of c.

Let [r, s : a] denote the chain [r, s] of T(a, α) and let [r, s : a]′ denote
the chain [r, s] of T′(a, α). If n is any positive integer it follows from
Lemma 5.1 that the chain [−n, n] of U is identical with a chain
[−n + $p_n$, n + $p_n$ : c] of T(c, α) and hence that [−n, n] is identical with the
chain [−n, n : $a_n$] of T($a_n$, α) where $a_n = c - p_n$. The points P($a_n$) of Γ have
a cluster point P(a) and we can assume the sequence $n_i$, (i = 1, 2, ...), so
chosen that the points $a_{n_1}, a_{n_2}, \ldots$ vary in one sense on the x-axis and
approach x = a as a limit point. With increasing i, the points

(5.2)    $a_{n_i} - m\beta, \ldots, a_{n_i} - \beta,\ a_{n_i},\ a_{n_i} + \beta, \ldots, a_{n_i} + m\beta$

approach the points

(5.3)    $a - m\beta, \ldots, a - \beta,\ a,\ a + \beta, \ldots, a + m\beta$

respectively, either from the right or from the left. If a ≡ k mod β for no
integer k, no point of the set (5.3) is integral. For a given m and i sufficiently
large the chain [−m, m : $a_{n_i}$] is identical with the chain [−m, m : a].
If i is also chosen so large that m ≤ $n_i$, the chain [−m, m] of U and the
chain [−m, m : $a_{n_i}$] are identical. It follows that the chain [−m, m] of U
is identical with the chain [−m, m : a] of T(a, α) for every positive integer
m and hence U is identical with T(a, α).

If a ≡ k mod β, where k is an integer, and the points (5.2) approach
the points (5.3) from the right, the chains [−m, m : $a_{n_i}$] and [−m, m : a]
are again identical for fixed m and for sufficiently large i. Again U is identical
with T(a, α).

If a ≡ k mod β, where k is an integer, and the points (5.2) approach
the points (5.3) from the left it is easily seen that the chain [−m, m : $a_{n_i}$]
and the chain [−m, m : a]′ of T′(a, α) are identical for fixed m and for i
sufficiently large. In this case the chain [−m, m] of U is identical with the
chain [−m, m : a]′ of T′(a, α) for every positive integer m. But this
implies the identity of U and T′(a, α).

The proof of the theorem is complete. 

6. The continuation of Sturmian series. A Sturmian n-chain which
appears as a subchain of a Sturmian series T will be said to admit T as a
Sturmian continuation.


THEOREM 6.1. Each Sturmian chain x admits aleph continuations.


By virtue of Theorem 2.3 there exists an open interval of values of α
such that the n-chains of x satisfy the relations (5.1). It follows from
Lemma 5.1 that for each such value of α there exists a constant c such that
T(c, α) contains a copy of x. If α ≠ α′ the cell series T(c, α) and T(a, α′)
are not identical so that there are aleph Sturmian continuations of x.

A Sturmian beam whose cells $B_i$ are respectively identical with the cells
$B'_i$ with the same index in a Sturmian series T will be said to admit T as a
Sturmian continuation.


THEOREM 6.2. Each Sturmian beam R with irrational frequency α
admits at least one and at most two Sturmian continuations. In the case
where R admits different continuations these continuations are identical
respectively with the cell-series T(m, α) and T′(m, α) for a suitable choice of
the integer m.


If the Sturmian beam R with irrational frequency admits a Sturmian
continuation, we infer from Theorem 5.3 that this continuation is identical
with one of the cell series T(c, α) or T′(c, α) for a suitable choice of c. The
Sturmian beam R does not admit two distinct Sturmian continuations of the
form T(c, α) and T(a, α). For T(c, α) and T(a, α) would then have a common
beam and it would follow as in the proof of Theorem 4.1 that
c ≡ a mod β, and hence T(c, α) = T(a, α). Similarly R admits at most one
Sturmian continuation of the form T′(c, α).

Essentially as in the proof of Theorem 5.3, so here it follows that R
possesses at least one continuation of the form T(c, α) or T′(c, α). If T(c, α)
and T′(a, α) are different Sturmian continuations of R, it would follow as in
the proof of Theorem 4.1 that a ≡ c mod β. But since T(c, α) ≠ T′(a, α)
we conclude from Theorem 4.3 that c ≡ m mod β where m is a suitably
chosen integer. But then T(c, α) and T′(a, α) are identical with T(m, α)
and T′(m, α) respectively. The proof of the theorem is complete.

THEOREM 6.3. A non-periodic Sturmian beam R with rational frequency
α admits a unique Sturmian continuation.


Without loss of generality we can assume that R is of the form


If α = q/p, where (q, p) = 1, it follows from Lemma 3.1 that there exists
one and only one p-chain of R whose b-length is different from q. We term
this chain the critical chain C of R. Exactly as in the proof of Theorem 3.3,
so here it follows that the beam following the initial cell $B'_m$ of C and the
chain preceding the terminal cell of C are periodic with cell-period p.

The b-lengths of the n-chains of R satisfy one of the conditions (2.11),
say (2.11)″. The b-lengths of the n-chains of S(m, α) also satisfy (2.11)″
and it follows as in the proof of Theorem 3.6 that the critical p-chains of R
and S(m, α) are identical. That their cells have the same indices follows
from our choice of m. In view of the relation of S(m, α) to its critical p-chain
as disclosed in Theorem 3.8 and the relation of R to its critical p-chain, we
can affirm that R appears as a beam of S(m, α) and of no other skew Sturmian
series.

In the case where R satisfies (2.11)′ similar arguments apply and show
that S′(m, α) is the unique continuation of R.

The proof of the theorem is complete. 


THEOREM 6.4. A periodic Sturmian beam R with rational frequency
α > 0 admits three dissimilar Sturmian continuations of which one is
periodic, and two are skew with critical chains of different type. If α = 0,
R admits two Sturmian continuations of which one is periodic and the other
is skew.


The proof of Theorem 3.5 shows that if two periodic Sturmian beams
have the same frequency α = q/p where (q, p) = 1, they have the same
cell-period p and contain the same p-chains. If α > 0, each of the cell-series
T(c, α), S(m, α) and S′(m, α) contains a periodic beam similar to R. If c
and m are suitably chosen, each of these cell-series will be a continuation of R.
The cell-series T(c, α), S(m, α) and S′(m, α) are dissimilar and it follows
from Theorems 3.5 and 3.6 that any other Sturmian series with frequency α
is similar to one of these.

If α = 0 it is easily seen that T(c, 0) and S′(m, 0), where m is a suitably
chosen integer, are the only dissimilar Sturmian continuations of R.

The proof of the theorem is complete. 

We distinguish between T(c, α) and T′(c, α) by assigning a type-index
+1 or −1 respectively to these series. We can similarly assign a type-index
+1 or −1 to the series S(m, α) and S′(m, α) respectively. Thus every Sturmian
series T possesses a frequency, at least one pole and a type-index. As we
have seen, T admits a mechanical continuation uniquely determined by these
numerical characteristics.

In Part II a class of similar Sturmian series will be called a Sturmian
trajectory. We note that the members of such a class admit the same numerical
characteristics.

II. THE RECURRENCY FUNCTION.

7. Sturmian trajectories and rays. We return to the concept of trajectories
and I-trajectories of SD, using the preceding symbols a and b as
generating symbols. Recall that an I-trajectory A is an indexed sequence of
the form

(7.1)    $\cdots c_{-1}\, c_0\, c_1\, c_2 \cdots$

in which the symbol $c_i$ is a or b. The class of I-trajectories "similar" to A
is a trajectory Ω represented by A.

Let x be an m-block of Ω. The number of symbols a or b in x will be
termed the a-length or b-length respectively of x and written a(x) or b(x).
Corresponding to the comparison condition of § 2 we here introduce the
following condition.


S. Under Condition S the a-lengths (b-lengths) of two m-blocks with 
the same m shall differ by at most one. 


A trajectory whose blocks satisfy Condition S will be termed a Sturmian 
trajectory. 
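Condition S can be tested directly on any finite block. The sketch below is an illustration under an assumption of its own: it checks the condition only on the sub-blocks of one finite word, whereas the definition above concerns all blocks of an unending trajectory. The function name is hypothetical.

```python
def satisfies_condition_s(word):
    """Check Condition S on the sub-blocks of a finite word over {'a', 'b'}.

    For every length m, the a-lengths of any two m-blocks may differ by at
    most one. Since all m-blocks have the same length, the same bound then
    holds for the b-lengths.
    """
    for m in range(1, len(word) + 1):
        a_lengths = [word[i:i + m].count("a") for i in range(len(word) - m + 1)]
        if max(a_lengths) - min(a_lengths) > 1:
            return False
    return True

print(satisfies_condition_s("abaababaabaab"))   # True
print(satisfies_condition_s("aabb"))            # False: the 2-blocks "aa" and "bb"
```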

Prior to the present section we have been considering Sturmian series.
Such series are sequences in which the cells are indexed rather than the symbols
a and b. They are accordingly logically distinct from indexed trajectories.
In the latter the individual symbols a and b are indexed. Each Sturmian
series T however defines a trajectory Ω consisting of the symbols a and b
appearing in T ordered as in T. We shall say that Ω is represented by T.

In any Sturmian trajectory at least one of the symbols a or b appears
infinitely many times preceding and following any given symbol. If in particular
the symbol a does not so appear, the trajectory must have one of the
two following special forms:

(7.2)    $\cdots b\,b\,b \cdots$
(7.3)    $\cdots b\,b\,a\,b\,b \cdots$

These trajectories will be called b-trajectories.

The trajectories defined by Sturmian series always include infinitely many 
a’s and so never include the b-trajectories. More precisely, we have the 
following theorem. 


THEOREM 7.1. The trajectories defined by Sturmian series satisfy Con- 
dition S and include all such trajectories except the b-trajectories. 


Let T be a Sturmian series and let Ω be the trajectory defined by T. Let
x and y be arbitrary m-blocks of Ω. Let u be the chain of maximum a-length
in x, and v a chain of minimum a-length in T containing y. We see that
a(v) ≤ a(y) + 2, and that a(u) = a(x). If it were true that

(7.4)    $a(y) + 2 \le a(x),$

it would follow that a(v) ≤ a(u) and there would be a subchain of u with
the a-length of v. We could then infer from Condition C that

(7.5)    $b(v) \le b(u) + 1.$

From (7.4) and (7.5) we find that

$a(y) + b(v) \le a(x) + b(u) - 1.$

But this is impossible since

$m \le a(y) + b(v), \qquad a(x) + b(u) \le m.$

Relation (7.4) is accordingly false. We infer |a(x) − a(y)| ≤ 1. Upon recalling
that x and y have the same length m we conclude that |b(x) − b(y)| ≤ 1.
The trajectory Ω thus satisfies S.

Conversely, let Ω be an arbitrary Sturmian trajectory which is not a
b-trajectory. It is clear that there are no unending sequences of b's in Ω.
The symbols b of Ω can therefore be grouped into maximal blocks of symbols
b each preceded and followed by a symbol a. There accordingly exists a
cell-sequence T whose symbols a and b appear in T in their order in Ω. It remains
to prove that the chains of T satisfy Condition C.

Let x and y be two s-chains of T. If b(y) < b(x) − 1, the subblock of
x obtained by dropping the two terminal a's of x would contain a subblock z
of the length of y. Then a(y) − a(z) ≥ 2, contrary to the fact that Ω satisfies
Condition S. Hence b(y) ≥ b(x) − 1. Similarly b(x) ≥ b(y) − 1.
Thus T satisfies Condition C.

We have seen that a Sturmian trajectory Ω which is not a b-trajectory is
representable by a Sturmian series T. The n-chains of T will be termed
n-chains of Ω. It is clear that the class of n-chains of Ω is independent of
the choice of the Sturmian series T representing Ω.

A non-special Sturmian trajectory Ω will be said to have the frequency α
of any Sturmian series T representing Ω. It is clear that α is independent
of the choice of Sturmian series T representing Ω. A special Sturmian
trajectory will be said to have the frequency α = ∞.

A Sturmian trajectory Ω defined by an irrational or skew Sturmian series,
respectively, will be termed irrational or skew. The special trajectory (7.3)
will also be termed skew Sturmian. Sturmian trajectories defined by periodic
Sturmian series are periodic in the sense of SD. They include all periodic
Sturmian trajectories except (7.2).

A trajectory Ω is recurrent if corresponding to any positive integer n
there exists an integer m such that each m-block of Ω contains a copy of every
n-block of Ω. The least such value of m is called the n-th recurrency index
R(n) of Ω and the function R(n) is termed the recurrency function of Ω (cf.
SD, p. 827). It is clear that a skew Sturmian trajectory is not recurrent.
We shall show that all other Sturmian trajectories are recurrent.

The Sturmian trajectory ⋯bbb⋯ has the recurrency function
R(n) = n. Any other periodic Sturmian trajectory Ω has a finite rational
frequency α and is represented by T(0, α). It is clear that Ω is recurrent.
The recurrency function of Ω depends merely on α and will be denoted by
R(n, α).

To show that irrational Sturmian trajectories are recurrent we shall need 


the following lemma. 


LEMMA 7.1. If Ω is a Sturmian trajectory with frequency α and Ω′ is
a limit trajectory of Ω, then Ω′ is a Sturmian trajectory with frequency α.

Since Ω′ is a limit trajectory of Ω, every block of Ω′ appears in Ω. Hence
Ω′ satisfies Condition S and is Sturmian. Every n-chain of Ω′ appears in Ω,
and since the definition of the frequency α leaves the choice of n-chains
arbitrary subject to the condition that n become infinite, it appears that Ω and
Ω′ have the same frequency. The proof of the lemma is complete.

Now consider a Sturmian trajectory Ω with irrational frequency α. Any
limit trajectory Ω′ of Ω is Sturmian and has the frequency α. It follows from
Theorem 3.7 that Ω and Ω′ contain the same chains and hence the same blocks.
The permutation number P(n) of Ω (cf. SD, § 6) is accordingly identical
with that of Ω′, so that Ω is a minimal trajectory. A minimal trajectory is
recurrent as stated in Theorem 7.2 of SD. Finally, the recurrency function
of Ω depends only on α. For the chains and blocks of Ω are exactly those
of T(0, α) so that the recurrency function of Ω is that of T(0, α). We thus
have the following theorem.

THEOREM 7.2. Any Sturmian trajectory with irrational frequency α is
recurrent with a recurrency function R(n, α) uniquely determined by α.


8. The derivation of Sturmian trajectories. Let Ω be a Sturmian
trajectory represented by a cell-sequence

(8.1)    $\cdots B_{-1}\, B_0\, B_1 \cdots$

Corresponding to Ω we introduce a new trajectory Ω′ with an indexed
representation

(8.2)    $\cdots c_{-1}\, c_0\, c_1 \cdots$

defined as follows. Let $c_i$ = a if $B_i$ is of minimum type, and let $c_i$ = b if
$B_i$ is of maximum type. If all cells $B_i$ are of the same type let $c_i$ = a for all i.
The trajectory Ω′ will be said to be derived from Ω and the I-representation
(8.2) of Ω′ will be said to correspond to the representation (8.1) of Ω. We
proceed with a proof of the following theorem.


THEOREM 8.1. Let Ω′ be a trajectory derived from a recurrent Sturmian
trajectory Ω with a frequency α. The trajectory Ω′ is Sturmian and has a
frequency

(8.3)    $\alpha' = \dfrac{\alpha - \omega}{\omega + 1 - \alpha},$

where* ω = [α].


Let k be the number of b's in a cell of Ω of minimum type. Suppose Ω
represented by (8.1). Let x be an arbitrary n-chain of (8.1) and let y be
the corresponding n-block of (8.2). Let $b_n$ denote the b-length of x and let
$n_a$ and $n_b$ denote the a-length and b-length respectively of y. Each symbol a
in (8.2) corresponds to a cell of (8.1) of b-length k, and each symbol b of
(8.2) corresponds to a cell of (8.1) of b-length k + 1. Hence

$b_n = k\,n_a + (k+1)\,n_b.$

But $n_a + n_b = n$ so that $b_n$ may be given the forms

(8.4)    $b_n = (k+1)\,n - n_a = k\,n + n_b.$

Since (8.1) is a Sturmian series, $b_n$ varies by at most one for different
n-chains x of (8.1). Hence the values of $n_a$ {$n_b$} differ by at most one for
different n-blocks y of (8.2). Thus Ω′ is Sturmian.

It remains to evaluate the frequency α′ of Ω′. First observe that the
symbol a occurs infinitely many times preceding and following each symbol of
(8.2) since Ω is recurrent. Suppose the block y of (8.2) is an m-chain.
Then $m = n_a - 1$ and

$\alpha' = \lim_{m\to\infty} \dfrac{n_b}{m} = \lim_{n\to\infty} \dfrac{n_b}{n_a - 1} = \lim_{n\to\infty} \dfrac{n_b}{n_a}.$

Upon making use of (8.4) we see that

$\alpha' = \lim_{n\to\infty} \dfrac{b_n - k\,n}{(k+1)\,n - b_n} = \dfrac{\alpha - k}{(k+1) - \alpha}.$

If α is not an integer it follows from (2.11) that k is the least integer
such that

$\alpha - 1 < k.$

Hence k = [α]. If α is an integer each cell of Ω has the b-length α and
k = α = [α]. Hence (8.3) holds as stated.
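The derivation and the resulting frequency α′ = (α − k)/((k + 1) − α), with k = [α], can be checked on a periodic example. The following sketch is an illustration rather than part of the paper. It assumes that the cell b-lengths of a mechanical sequence with frequency α are ceil((i + 1)α) − ceil(iα), which corresponds to counting the points kβ on i ≤ x < i + 1. Exact fractions avoid rounding, and the function names are hypothetical.

```python
from fractions import Fraction
from math import ceil, floor

def cell_b_lengths(alpha, count):
    """b-lengths of cells B_0, ..., B_{count-1} under the assumed convention."""
    return [ceil((i + 1) * alpha) - ceil(i * alpha) for i in range(count)]

def derive(b_lengths):
    """Derived symbols: 'a' for a cell of minimum type, 'b' for one of maximum type.

    If all cells have the same b-length, every symbol is 'a'.
    """
    k = min(b_lengths)
    return "".join("a" if b == k else "b" for b in b_lengths)

def derived_frequency(alpha):
    """The value (alpha - k) / ((k + 1) - alpha) with k = [alpha]."""
    k = floor(alpha)
    return (alpha - k) / (k + 1 - alpha)

alpha = Fraction(7, 5)
cells = cell_b_lengths(alpha, 50)          # ten full periods of five cells
word = derive(cells)
print(cells[:5], word[:5])                 # [2, 1, 2, 1, 1] babaa
print(Fraction(word.count("b"), word.count("a")), derived_frequency(alpha))   # 2/3 2/3
```

Here the ratio of b's to a's in the derived block, which is the quantity n_b / n_a of the proof, agrees with the formula.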


* [α] is the maximum integer not exceeding α.




COROLLARY. If Ω is periodic or irrational, then Ω′ is respectively periodic
or irrational.

For Ω is recurrent and α′ is rational or irrational according as α is
rational or irrational.

Let Ω be a recurrent Sturmian trajectory with frequency α and let Ω′ be
the trajectory derived from Ω. Since Ω is recurrent it has the following
property of chain recurrence. Given any positive integer n there exists an
integer m such that every m-chain of Ω contains a copy of every n-chain of Ω.
The least such integer m will be denoted by p(n, α) and termed the chain
recurrency function of Ω. If α′ is the frequency of Ω′ it is clear that

$p(n, \alpha) = R(n, \alpha') = R\!\left(n, \dfrac{\sigma}{1 - \sigma}\right)$

where σ = α − [α]. In particular, if 0 ≤ α < 1, then [α] = 0 and σ = α
so that

(8.5)    $p(n, \alpha) = R\!\left(n, \dfrac{\alpha}{1 - \alpha}\right) \qquad (0 \le \alpha < 1).$

Upon setting

$\delta = \dfrac{\alpha}{1 - \alpha} \qquad (0 \le \alpha < 1)$

we find that

$\alpha = \dfrac{\delta}{1 + \delta} \qquad (0 \le \delta < \infty)$

so that (8.5) takes the form

(8.6)    $R(n, \delta) = p\!\left(n, \dfrac{\delta}{1 + \delta}\right) \qquad (0 \le \delta < \infty).$


The recurrency function R(n, δ) of recurrent Sturmian trajectories will
accordingly be known once we have determined the chain recurrency function
p(n, α) for 0 ≤ α < 1. We proceed with a study of chain recurrency functions.


9. The determination of p(n, α) in terms of the functions E(ε, α)
and l(n, α). The function p(n, α) is the chain recurrency function of T(c, α)
and is independent of c. We shall suppose that α is irrational inasmuch as
the recurrency function of a trajectory with period ω is ω + n − 1 for n ≥ ω.

We introduce the points

(9.0)    $P(c+1), P(c+2), \ldots, P(c+n), \qquad (n \ge 1)$

on the β-circle Γ. These points are all distinct since α is irrational. They
determine n non-overlapping intervals PQ on Γ, where in the special case
n = 1, P = Q and PQ is the whole of Γ. (For the conventions concerning
intervals PQ see § 4.) Let this set of intervals be denoted by I(c, n, α).




Since the set (9.0) can be obtained from a similar set with c replaced by c′
by a rotation of Γ into itself, the lengths of the shortest and longest intervals
of I(c, n, α) are independent of c and will be denoted by l(n, α) and L(n, α)
respectively. When n is infinite in (9.0) the points (9.0) are everywhere
dense on Γ and it follows that

$\lim_{n\to\infty} l(n, \alpha) = \lim_{n\to\infty} L(n, \alpha) = 0.$

Let ε be a constant such that 0 < ε. Let E(ε, α) be the least integer
m such that the maximum of the lengths of intervals of I(c, m, α) is at most ε.
It is clear that E(ε, α) is independent of c in (9.0). We term E(ε, α) the
ergodic function belonging to α.

Recall that [r, s : c] denotes the chain [r, s] of T(c, α).

LEMMA 9.1. A set of n-chains

$[1, n : a_1],\ [1, n : a_2],\ \ldots,\ [1, n : a_k]$

contains all n-chains of T(c, α) if and only if there is a point of the set P($a_i$),
(i = 1, ..., k), in each of the intervals of the set I(0, n + 1, α).

An arbitrary n-chain [r, s] of T(c, α) is identical with the chain [1, n]
of T(c − r + 1, α). Hence the n-chains of T(c, α) are found among the chains
[1, n] of T(a, α) for suitable choices of a. Observe that two n-chains [1, n]
of T(a, α) and T(a′, α) will be identical if corresponding subchains have the
same type. Lemma 4.1 gives the conditions under which two such subchains
are of the same type, stating these conditions in terms of the intervals of Γ
defined by the points

$P(1), P(2), \ldots, P(n+1).$

Lemma 9.1 follows from Lemma 4.1.

THEOREM 9.1. The chain recurrency function p(n, α) of a recurrent
Sturmian trajectory has the value

(9.1)    $p(n, \alpha) = E[\,l(n+1, \alpha), \alpha\,] + n - 1.$

We shall begin by proving the following:

(a) If m is an integer which equals the right member of (9.1), then
any m-chain x of T(c, α) contains every n-chain of T(c, α).

The m-chain x is identical with an m-chain [1, m : a] for a suitable
choice of a. Since m ≥ n, the chain [1, m : a] contains the n-chains

$[1, n : a],\ [2, n+1 : a],\ \ldots,\ [m-n+1, m : a].$

These chains are respectively identical with the chains

(9.2)    $[1, n : a],\ [1, n : a-1],\ \ldots,\ [1, n : a-m+n].$

According to Lemma 9.1, the set (9.2) contains all n-chains of T(c, α)
provided there is a point of the set

(9.3)    $P(a), P(a-1), \ldots, P(a-m+n)$

in each of the intervals of I(0, n + 1, α).

By hypothesis in (a)

$m - n + 1 = E[\,l(n+1, \alpha), \alpha\,]$

and it follows from the definition of E(ε, α) that the maximum length of the
non-overlapping intervals on Γ defined by the points (9.3) is at most
l(n + 1, α). But the length of the shortest of the intervals I(0, n + 1, α)
is l(n + 1, α) and we conclude that there is a point of the set (9.3) in each
of the intervals of I(0, n + 1, α). The set (9.2) and hence [1, m : a] and x
contain each n-chain of T(c, α). The proof of (a) is complete.

It follows from (a) that when m equals the right member of (9.1)
p(n, α) ≤ m. Hence

(9.4)    $p(n, \alpha) \le E[\,l(n+1, \alpha), \alpha\,] + n - 1.$

We shall now suppose that m is an integer such that

(9.5)    $0 < m - n + 1 < E[\,l(n+1, \alpha), \alpha\,]$

and show that there exists an m-chain of T(c, α) which does not contain
every n-chain of T(c, α).

When (9.5) holds there is an interval A(a) of Γ of length l(n + 1, α)
containing none of the points (9.3). By choosing a properly, the interval
A(a) can be brought into coincidence with any given interval of length
l(n + 1, α) of Γ. There is an interval A* of length l(n + 1, α) in the set
I(0, n + 1, α) and we can assume a so chosen that the interval A(a) coincides
with A*. But then the set (9.2) of n-chains does not contain all n-chains
of T(c, α) so that the m-chain [1, m : a] does not contain all n-chains of
T(c, α). But the chain [1, m : a] is identical with an m-chain of T(c, α),
for the cell-series T(c, α) and T(a, α) contain the same set of m-chains.
Thus, if (9.5) holds, there is an m-chain of T(c, α) which does not contain
all n-chains of T(c, α). We conclude that

(9.6)    $p(n, \alpha) \ge E[\,l(n+1, \alpha), \alpha\,] + n - 1.$

The theorem follows from (9.4) and (9.6).
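Formula (9.1), p(n, α) = E[l(n + 1, α), α] + n − 1, can be compared with a brute-force count on a finite sample. The sketch below is only a numerical illustration. It assumes that P(k) is the point k mod β on a circle of length β = 1/α, and that the cells of a mechanical sequence have b-lengths floor((i + 1)α) − floor(iα). Floating point is used with a small tolerance (the parameter tol is an addition of this sketch), and the function names are hypothetical.

```python
import math

def gap_lengths(n, alpha):
    """Lengths of the n intervals cut from the circle of length 1/alpha by P(1), ..., P(n)."""
    beta = 1.0 / alpha
    if n == 1:
        return [beta]                      # a single point: the interval is the whole circle
    pts = sorted(k % beta for k in range(1, n + 1))
    return [(pts[(i + 1) % n] - pts[i]) % beta for i in range(n)]

def ergodic_function(eps, alpha, tol=1e-9):
    """E(eps, alpha): least m for which the longest interval has length at most eps."""
    m = 1
    while max(gap_lengths(m, alpha)) > eps + tol:
        m += 1
    return m

def chain_recurrency_formula(n, alpha):
    """Right member of (9.1)."""
    shortest = min(gap_lengths(n + 1, alpha))          # l(n + 1, alpha)
    return ergodic_function(shortest, alpha) + n - 1

def chain_recurrency_brute_force(cells, n):
    """Least m such that every m-chain of the sample contains every n-chain of the sample."""
    def chains(lo, hi):
        return {tuple(cells[i:i + n]) for i in range(lo, hi - n + 1)}
    all_chains = chains(0, len(cells))
    m = n
    while not all(all_chains <= chains(i, i + m)
                  for i in range(len(cells) - m + 1)):
        m += 1
    return m

alpha = math.sqrt(2) - 1
cells = [math.floor((i + 1) * alpha) - math.floor(i * alpha) for i in range(400)]
for n in (1, 2, 3):
    print(n, chain_recurrency_formula(n, alpha), chain_recurrency_brute_force(cells, n))
```

For α = √2 − 1 both computations give 3, 8 and 9 for n = 1, 2, 3. The brute-force value is exact only insofar as the sample is long enough to contain the worst m-chains, which is the case here.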


10. The evaluation of the recurrency function of Sturmian trajectories.
According to (8.6) the recurrency function R(n, α) of a recurrent
Sturmian trajectory with frequency α > 0 is the chain recurrency function
of a Sturmian trajectory with frequency $\alpha(1+\alpha)^{-1} = \gamma$. As given by (9.1),
the chain recurrency function p(n, γ) is completely determined by the function
E[l(n + 1, γ), γ]. We shall show that the latter function bears a simple
relation to the denominators $B_\nu$ of the successive convergents of the continued
fraction representing γ.

According to its definition l(n, α), n ≥ 1, α irrational, is the length of
the shortest of the intervals of Γ determined by the points

(10.0)    $P(c+1), P(c+2), \ldots, P(c+n).$

Thus l(n + 1, α), where n ≥ 1, is the length of an interval P(c + i)P(c + j)
of Γ where i ≠ j and i and j lie between 1 and n + 1 inclusive. But the
length of such an interval is |s − rβ| ≠ 0, where s = |i − j| and r is a
properly chosen integer. Thus

$l(n+1, \alpha) = |s - r\beta|$

where s is an integer such that 0 < s ≤ n. Conversely, if s is an integer such
that 0 < s ≤ n and r is any integer, then either the length of the interval
P(c + 1)P(c + s + 1) or that of its complement on Γ does not exceed
|s − rβ| and hence

$l(n+1, \alpha) \le |s - r\beta|.$
Hence we have the following lemma. 


LemMA 10.0. The function I(n +1, %) ts the least positive value of 
1 

|s—rp|=—-|se—r| (a> 0) 
1% 


as r ranges over all integral values and s ussumes the values 1,2,° --°, 7. 
We are concerned with the behavior of R(n,a) for large values of n. 
If «= q/p where (p,q) =1, the Sturmian trajectory has the period p+ q 
and 
R(n,4) =p+tq+n—1, n=pt 


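We turn to the case in which $\alpha$ is irrational. Let

$\alpha = [b_0, b_1, b_2, \cdots]$

be the development of $\alpha$ as a continued fraction (cf. Perron, p. 39). The integers $b_i$ are uniquely determined by $\alpha$ and with the possible exception of $b_0$ are positive. The successive convergents $A_\nu/B_\nu$, $\nu \ge 0$, of $\alpha$ are determined recursively by the formulas

(10.1)  $A_{-1} = 1,\ A_0 = b_0,\ A_\nu = b_\nu A_{\nu-1} + A_{\nu-2}; \qquad B_{-1} = 0,\ B_0 = 1,\ B_\nu = b_\nu B_{\nu-1} + B_{\nu-2} \qquad (\nu \ge 1)$.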
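
The recursion (10.1) can be sketched in a few lines. This is a minimal illustration and not part of the original text: the helper names `partial_quotients` and `convergents` are hypothetical, the expansion is computed in floating point (so only the first few partial quotients are trustworthy), and the quantity $B_\nu - A_\nu\beta$ is evaluated under the assumption $\beta = 1/\alpha$ suggested by Lemma 10.0.

```python
import math


def partial_quotients(x, count):
    """First `count` partial quotients b_0, b_1, ... of x (floating-point sketch)."""
    quotients = []
    for _ in range(count):
        b = math.floor(x)
        quotients.append(b)
        frac = x - b
        if frac < 1e-12:  # x is (numerically) rational; stop early
            break
        x = 1.0 / frac
    return quotients


def convergents(quotients):
    """Yield (A_v, B_v) using A_v = b_v A_{v-1} + A_{v-2}, B_v = b_v B_{v-1} + B_{v-2}."""
    a_prev, b_prev = 1, 0            # A_{-1}, B_{-1}
    a_cur, b_cur = quotients[0], 1   # A_0, B_0
    yield a_cur, b_cur
    for q in quotients[1:]:
        a_prev, a_cur = a_cur, q * a_cur + a_prev
        b_prev, b_cur = b_cur, q * b_cur + b_prev
        yield a_cur, b_cur


alpha = math.sqrt(2)
for a, b in convergents(partial_quotients(alpha, 6)):
    # B_v - A_v * beta with beta = 1/alpha (assumed); the sign alternates with v
    print(a, b, b - a / alpha)
```

For $\alpha = \sqrt{2}$ the printed differences alternate in sign and shrink in absolute value, which is the behavior the two cases of Lemma 10.2 rely on.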


As is well known, the integers $A_\nu$ and $B_\nu$ are relatively prime and
(10. 2) 
Set

$M_\nu = B_\nu - A_\nu\beta$.

LEMMA 10.1. Corresponding to a given irrational $\alpha$, $l(n + 1, \alpha)$ is constant on each interval of the form $B_{\nu-1} \le n < B_\nu$ $(\nu > 0)$ and there has the value $|M_{\nu-1}|$.

If $n$ is an integer satisfying the conditions of the lemma and $s$ is an integer such that $0 < s \le n$, it follows that $0 < s < B_\nu$. Recall that $(A_\nu, B_\nu) = 1$, so that $r/s \ne A_\nu/B_\nu$ no matter what the integral choice of $r$. It follows from a theorem of Lagrange (cf. Perron, p. 52) that

(10.3)  $\frac{1}{\alpha}\,|s\alpha - r| \ge \frac{1}{\alpha}\,|B_{\nu-1}\alpha - A_{\nu-1}| = |M_{\nu-1}|$.

It follows from Lemma 10.0 that

$l(n + 1, \alpha) \ge |M_{\nu-1}|$.

The function $l(n + 1, \alpha)$ decreases monotonically with $n$ so that for $n \ge B_{\nu-1}$,

$l(n + 1, \alpha) \le l(B_{\nu-1} + 1, \alpha)$.

By virtue of Lemma 10.0, $l(B_{\nu-1} + 1, \alpha)$ does not exceed the value of $|s - r\beta|$ when $s = B_{\nu-1}$ and $r = A_{\nu-1}$ so that

$l(n + 1, \alpha) \le |M_{\nu-1}|$.

The proof of the lemma is complete.
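
Lemma 10.1 lends itself to a direct numerical check. The sketch below is illustrative only: it assumes $\beta = 1/\alpha$ (as the identity in Lemma 10.0 suggests), the function names are hypothetical, and the partial quotients of $\alpha = \sqrt{2}$ are supplied by hand. It compares a brute-force minimum of $|s - r\beta|$ over $0 < s \le n$ with the value $|B_{\nu-1} - A_{\nu-1}\beta|$ predicted on $B_{\nu-1} \le n < B_\nu$.

```python
import math


def shortest_gap(n, alpha):
    """Brute-force l(n+1, alpha): least |s - r*beta| for 0 < s <= n, beta = 1/alpha assumed."""
    beta = 1.0 / alpha
    best = float("inf")
    for s in range(1, n + 1):
        r = round(s / beta)  # integer r making r*beta closest to s
        best = min(best, abs(s - r * beta))
    return best


def predicted_gap(n, alpha, quotients):
    """|B_{v-1} - A_{v-1}*beta| for the v with B_{v-1} <= n < B_v."""
    beta = 1.0 / alpha
    a_prev, b_prev = 1, 0
    a_cur, b_cur = quotients[0], 1
    for q in quotients[1:]:
        a_next, b_next = q * a_cur + a_prev, q * b_cur + b_prev
        if b_cur <= n < b_next:
            return abs(b_cur - a_cur * beta)
        a_prev, b_prev, a_cur, b_cur = a_cur, b_cur, a_next, b_next
    raise ValueError("not enough partial quotients for this n")


alpha = math.sqrt(2)
quotients = [1, 2, 2, 2, 2, 2]  # sqrt(2) = [1, 2, 2, 2, ...]
for n in (1, 2, 4, 5, 11, 12, 28):
    print(n, shortest_gap(n, alpha), predicted_gap(n, alpha, quotients))
```

The two columns agree, and the value changes only when $n$ passes a denominator $1, 2, 5, 12, 29, \cdots$ of $\sqrt{2}$.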
Recall that $L(n, \alpha)$ is the maximum length of the non-overlapping intervals of $\Gamma$ determined by the points (10.0), and that it is independent of $c$ in (10.0).

LEMMA 10.2. For $\alpha$ irrational and $\nu > 0$

(10.5)  $L(B_{\nu-1} + B_\nu, \alpha) \le |M_{\nu-1}|$,

(10.5)'  $L(B_{\nu-1} + B_\nu - 1, \alpha) > |M_{\nu-1}|$,

except when $\nu = 1$ and $B_0 = B_1$.

The left member of (10.5) is the maximum length of the non-overlapping intervals of $\Gamma$ determined by the points

(10.6)  $P(1), P(2), \cdots, P(B_{\nu-1} + B_\nu)$.

To prove (10.5) it is sufficient to show that each point of the set (10.6) is followed on $\Gamma$ by a point of (10.6) at a distance not exceeding $|M_{\nu-1}|$. We distinguish two cases according to the sign of $M_{\nu-1}$.


Case I. $M_{\nu-1} > 0$. It follows from Perron, p. 42, that




On the axis of reals the point $B_{\nu-1} + i$ follows the point $i$ at a distance $B_{\nu-1}$. But $B_{\nu-1} \equiv M_{\nu-1} \bmod \beta$ and $M_{\nu-1}$ lies between $0$ and $\beta$ so that $P(B_{\nu-1} + i)$ follows $P(i)$ on $\Gamma$ at a distance equal to $M_{\nu-1}$. Since $M_{\nu-1} > 0$, it follows from Perron, p. 42, that $M_\nu$ lies between $0$ and $-\beta$ so that the point $P(i)$ follows $P(B_\nu + i)$ on $\Gamma$ at a distance equal to $|M_\nu| < M_{\nu-1}$. Thus each point of the set (10.6) is followed on $\Gamma$ by a point of (10.6) at a distance not exceeding $M_{\nu-1}$. The proof of (10.5) is complete in Case I.


Case II. $M_{\nu-1} < 0$. Arguments similar to those given in Case I show that each point of the set (10.6) is preceded (and hence followed) by a point of (10.6) at a distance not exceeding $|M_{\nu-1}|$.

The proof of (10.5) is complete. 

We turn to the proof of (10.5)'. It follows from Lemma 10.1 that

$l(B_\nu, \alpha) = |M_{\nu-1}| = l(B_{\nu-1} + 1, \alpha)$

so that there is no point of the set

(10.7)  $P(1), P(2), \cdots, P(B_\nu - 1)$

in the interior of an interval $I$ of length $2|M_{\nu-1}|$ with $P(B_\nu)$ as midpoint.
Both end points of $I$ cannot belong to the set (10.7). For if this were the case and these end points were $P(i)$ and $P(j)$ where $1 \le i < j \le B_\nu - 1$, then

$i + j \equiv 2B_\nu \bmod \beta$.

But this is impossible if $\alpha$ and hence $\beta$ is irrational. Thus it is clear that there is an interval $I^*$ of $\Gamma$ of length exceeding $|M_{\nu-1}|$, with $P(B_\nu)$ as one of its end points, and with no point of the set (10.7) or $P(B_\nu)$ in its interior.


Consider the set

(10.8)  $P(B_\nu), P(B_\nu + 1), \cdots, P(B_\nu + B_{\nu-1} - 1)$.

There are $B_{\nu-1}$ points in this set and if $\nu = 1$, or if $\nu = 2$ and $B_0 = B_1 = 1$, the set consists of the single point $P(B_\nu)$ which is not in the interior of $I^*$. In any other case it follows from Lemma 10.1 that the shortest distance on $\Gamma$ between points of the set (10.8) is $|M_{\nu-2}| > |M_{\nu-1}|$. But then a suitably chosen subinterval $I^{**}$ of $I^*$ contains no points of the set (10.8) in its interior and has a length exceeding $|M_{\nu-1}|$. There are no points of the sets (10.7) or (10.8) in $I^{**}$. This implies (10.5)'.

The proof of the lemma is complete. 

LEMMA 10.3. If $\nu > 0$ and $B_{\nu-1} \le n < B_\nu$,

(10.9)  $E\{l(n + 1, \alpha), \alpha\} = B_{\nu-1} + B_\nu$.

For if $B_{\nu-1} \le n < B_\nu$ it follows from Lemma 10.1 that

$l(n + 1, \alpha) = |M_{\nu-1}|$.

According to (10.5)' there exists an interval of $\Gamma$ of length exceeding $|M_{\nu-1}|$ which contains none of the points

(10.10)  $P(1), P(2), \cdots, P(B_{\nu-1} + B_\nu - 1)$

and hence

(10.11)  $E\{l(n + 1, \alpha), \alpha\} > B_{\nu-1} + B_\nu - 1$.

It follows from (10.5) that every interval of $\Gamma$ of length $|M_{\nu-1}|$ contains a point of the set (10.6) and thus

(10.12)  $E\{l(n + 1, \alpha), \alpha\} \le B_{\nu-1} + B_\nu$.

The equality (10.9) is implied by (10.11) and (10.12).

Recall that $\gamma = \alpha(1 + \alpha)^{-1}$. Let $\gamma = [d_0, d_1, d_2, \cdots]$ be the continued fraction representation of $\gamma$ and let $C_\nu/D_\nu$ be the corresponding $\nu$-th convergent.

THEOREM 10.1. The recurrency function $R(n, \alpha)$ increases by unity when $n$ increases from $n - 1$ to $n$ except when $n$ is a denominator $D_\nu$ of $\gamma = \alpha(1 + \alpha)^{-1}$. For these exceptional values of $n$,

(10.13)  $R(D_\nu, \alpha) = D_{\nu+1} + 2D_\nu - 1 \qquad (\nu \ge 0)$

starting with $D_1$ when $D_0 = D_1$. $R(n, \alpha)$ is thereby uniquely determined for all positive integers $n$.


According to (8.6) and (9.1)

(10.14)  $R(n, \alpha) = p(n, \gamma) = E\{l(n + 1, \gamma), \gamma\} + n - 1$.

If $D_{\nu-1} \le n < D_\nu$ it follows from (10.14) and (10.9) that

(10.15)  $R(n, \alpha) = D_\nu + D_{\nu-1} + n - 1$

and consequently if $D_{\nu-1} < n < D_\nu$,

(10.16)  $R(n, \alpha) = R(n - 1, \alpha) + 1$.

But when $\alpha$ and hence $\gamma$ is irrational, each positive integer $n$ not a denominator of $\gamma$ lies between two successive denominators of $\gamma$. Thus (10.16) holds if $n$ is not a denominator of $\gamma$. Upon setting $n = D_{\nu-1}$ in (10.15) we find that

$R(D_{\nu-1}, \alpha) = D_\nu + 2D_{\nu-1} - 1$

excepting the case where $D_{\nu-1} = D_0 = D_1$.

We infer the truth of the theorem.
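
Formula (10.15) can be compared with a brute-force count on a long block of symbols. The sketch below is illustrative and rests on assumptions not stated in this section: the trajectory is modelled by a "mechanical" word in which the symbol `b` has density $\gamma = \alpha(1+\alpha)^{-1}$, the recurrency function is taken to be the least $m$ such that every block of length $m$ contains every $n$-block, and a finite prefix stands in for the unending sequence. All function names are hypothetical.

```python
import math


def denominators(gamma, count):
    """Denominators D_0, D_1, ... of the convergents of gamma (floating-point sketch)."""
    dens = []
    d_prev, d_cur = 0, 1  # D_{-1} = 0, D_0 = 1
    x = gamma
    for _ in range(count):
        dens.append(d_cur)
        x = 1.0 / (x - math.floor(x))
        d_prev, d_cur = d_cur, math.floor(x) * d_cur + d_prev
    return dens


def recurrency(n, dens):
    """R(n, alpha) = D_{v+1} + D_v + n - 1 for D_v <= n < D_{v+1}, as in (10.15)."""
    for d_lo, d_hi in zip(dens, dens[1:]):
        if d_lo <= n < d_hi:
            return d_hi + d_lo + n - 1
    raise ValueError("n exceeds the tabulated denominators")


def mechanical_word(gamma, length, offset=0.0):
    """Two-symbol word whose k-th symbol is 'b' when floor jumps between k and k + 1."""
    return "".join(
        "b" if math.floor((k + 1) * gamma + offset) > math.floor(k * gamma + offset) else "a"
        for k in range(length)
    )


def recurrency_brute(word, n):
    """Largest gap between consecutive occurrences of any n-block, plus n - 1."""
    last_seen = {}
    max_gap = 0
    for i in range(len(word) - n + 1):
        block = word[i:i + n]
        if block in last_seen:
            max_gap = max(max_gap, i - last_seen[block])
        last_seen[block] = i
    return max_gap + n - 1


gamma = (math.sqrt(5) - 1) / 2  # alpha = (sqrt(5) + 1)/2
dens = denominators(gamma, 12)
word = mechanical_word(gamma, 3000)
for n in range(1, 9):
    print(n, recurrency(n, dens), recurrency_brute(word, n))
```

The jump of $R$ at each denominator, and the unit increase elsewhere, is visible directly in the printed table.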


THEOREM 10.2. For irrational $\alpha$ the recurrency functions $R(n, \alpha)$ and $R(n, 1/\alpha)$ are identical; conversely if $R(n, \alpha)$ and $R(n, \alpha')$ are equal for all values of $n$, either $\alpha' = \alpha$ or $\alpha' = \alpha^{-1}$.




Let $\Omega$ be the Sturmian trajectory defined by $T(c, \alpha)$. The function $R(n, \alpha)$ is the recurrency function of $\Omega$. The trajectory $\Omega^*$ obtained from $T(c, \alpha)$ by replacing $a$ by $b$ and $b$ by $a$ has the frequency $\alpha^{-1}$. But the definition of the recurrency function of a Sturmian trajectory is symmetric with respect to $a$ and $b$ and it follows that the recurrency function $R(n, \alpha^{-1})$ of $\Omega^*$ is identical with $R(n, \alpha)$.

To prove the converse let us assume that $R(n, \alpha)$ and $R(n, \alpha')$ are equal for all values of $n$. It follows from Theorem 10.1 that the denominators of $\alpha(1 + \alpha)^{-1}$ and of $\alpha'(1 + \alpha')^{-1}$ are identical as sets of numbers. Let $D_\nu$ and $D'_\nu$ be respectively the $\nu$-th denominators of $\alpha(1 + \alpha)^{-1}$ and of $\alpha'(1 + \alpha')^{-1}$ with $\nu \ge 0$. We distinguish between four cases:

Case I. $D_0 < D_1$, $D'_0 < D'_1$.

Case II. $D_0 = D_1$, $D'_0 = D'_1$.

Case III. $D_0 = D_1$, $D'_0 < D'_1$.

Case IV. $D_0 < D_1$, $D'_0 = D'_1$.

The values assumed by the denominators $D_\nu$ form a set of numbers identical with the set of numbers assumed by the denominators $D'_\nu$. In Cases I and II it follows that $D_i = D'_i$ for all admissible values of $i$. But this implies that the continued fraction developments of $\alpha(1 + \alpha)^{-1}$ and $\alpha'(1 + \alpha')^{-1}$ are identical. Consequently these numbers are equal and $\alpha = \alpha'$.

In Case III it is clear that $D_{i+1} = D'_i$ for each non-negative integer $i$. It follows that

$\alpha'(1 + \alpha')^{-1} = 1 - \alpha(1 + \alpha)^{-1}$.

It is easily shown that $\alpha' = \alpha^{-1}$. The proof in Case IV is similar to the proof in Case III.

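11. The asymptotic behavior of $R(n, \alpha)$. We continue with the case of an irrational frequency $\alpha$. The constant $\gamma = \alpha(1 + \alpha)^{-1}$ is then irrational and the denominators of $\gamma$ form an infinite sequence

(11.1)  $D_0 \le D_1 < D_2 < \cdots$.

Given a positive integer $n$ there exists a unique non-negative integer $\nu$ such that $D_\nu \le n < D_{\nu+1}$ and according to (10.15), for these values of $n$

(11.2)  $R(n, \alpha) = D_{\nu+1} + D_\nu + n - 1, \qquad (D_\nu \le n < D_{\nu+1})$.

This implies that

(11.3)  $2 + \dfrac{D_\nu}{D_{\nu+1} - 1} \le \dfrac{R(n, \alpha)}{n} \le 2 + \dfrac{D_{\nu+1} - 1}{D_\nu}, \qquad (D_\nu \le n < D_{\nu+1})$,

where the equalities on the left and right hold respectively when $n = D_{\nu+1} - 1$ and $n = D_\nu$. Let

$\lambda(\alpha) = \limsup_{\nu \to \infty} \dfrac{D_{\nu+1}}{D_\nu} \ge 1$,

and recall that $\lambda(\alpha)$ may be infinite. It follows from (11.3) that

(11.4)  $\limsup_{n \to \infty} \dfrac{R(n, \alpha)}{n} = 2 + \lambda(\alpha)$,

(11.5)  $\liminf_{n \to \infty} \dfrac{R(n, \alpha)}{n} = 2 + \lambda^{-1}(\alpha)$,

understanding that $\lambda^{-1}(\alpha) = 0$ if $\lambda(\alpha) = \infty$. If $\lambda(\alpha)$ is finite the closed interval

$2 + \lambda^{-1}(\alpha) \le \tau \le 2 + \lambda(\alpha)$

will be called the limit range of $R(n, \alpha)/n$ corresponding to $\alpha$. If $\lambda(\alpha)$ is infinite this limit range shall be $2 \le \tau < \infty$.

THEOREM 11.1. The limit range corresponding to any irrational $\alpha$ is of length at least 1 and on this range the numbers $R(n, \alpha)/n$, $(n = 1, 2, \cdots)$, are everywhere dense.

It follows from (10.1) that

$\dfrac{D_{\nu+1}}{D_\nu} - \dfrac{D_{\nu-1}}{D_\nu} = d_{\nu+1} \ge 1 \qquad (\nu > 0)$

and hence if $\lambda(\alpha)$ is finite

$\limsup_{\nu \to \infty} \dfrac{D_{\nu+1}}{D_\nu} - \liminf_{\nu \to \infty} \dfrac{D_{\nu-1}}{D_\nu} = \lambda(\alpha) - \lambda^{-1}(\alpha) \ge 1$.

The length of the limit range in this case is at least 1. If $\lambda(\alpha) = \infty$, the length of the limit range is evidently infinite.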

To prove that the set $R(n, \alpha)/n$ is everywhere dense on the limit range, let $x$ be any point in the (open) interior of the limit range. It follows from (11.3) that there exist arbitrarily large values of $\nu$ such that

$2 + \dfrac{D_\nu}{D_{\nu+1} - 1} < x < 2 + \dfrac{D_{\nu+1} - 1}{D_\nu}$.

According to (11.2), if $n$ is an integer such that $D_\nu \le n < n + 1 < D_{\nu+1}$, then

$\dfrac{R(n + 1, \alpha)}{n + 1} < \dfrac{R(n, \alpha)}{n}$

and thus if $n$ is a properly chosen integer

$\dfrac{R(n + 1, \alpha)}{n + 1} \le x \le \dfrac{R(n, \alpha)}{n} \qquad (D_\nu \le n < n + 1 < D_{\nu+1})$.

Since $R(n + 1, \alpha)$ exceeds $R(n, \alpha)$ by 1, the length of this interval is

$\dfrac{R(n, \alpha)}{n} - \dfrac{R(n + 1, \alpha)}{n + 1} = \dfrac{R(n + 1, \alpha) - (n + 1)}{n(n + 1)}$.

But $n$ becomes infinite with $\nu$ so that the length of this interval approaches zero. This implies that the set $R(n, \alpha)/n$ is everywhere dense on the limit range, and the proof of the theorem is complete.
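
A small numerical illustration of the limit range may help. The sketch below is not from the source: it takes $\alpha = \sqrt{2}$, for which $\alpha(1+\alpha)^{-1} = 2 - \sqrt{2} = [0, 1, 1, 2, 2, 2, \cdots]$ (partial quotients entered by hand), builds the denominators by (10.1), and evaluates the two bounds of (11.3). The function names are hypothetical.

```python
import math


def denominators_from_quotients(quotients):
    """Denominators D_0, D_1, ... built from partial quotients [d_0, d_1, ...] via (10.1)."""
    d_prev, d_cur = 0, 1
    dens = [d_cur]
    for d in quotients[1:]:
        d_prev, d_cur = d_cur, d * d_cur + d_prev
        dens.append(d_cur)
    return dens


def ratio_bounds(dens, v):
    """Bounds (11.3) on R(n, alpha)/n for D_v <= n < D_{v+1}; requires D_{v+1} > 1."""
    lower = 2 + dens[v] / (dens[v + 1] - 1)
    upper = 2 + (dens[v + 1] - 1) / dens[v]
    return lower, upper


quotients = [0, 1, 1] + [2] * 10   # 2 - sqrt(2), entered by hand
dens = denominators_from_quotients(quotients)
lower, upper = ratio_bounds(dens, 10)
print(dens[:8])
print(lower, upper, upper - lower)
```

Here the bounds approach $1 + \sqrt{2}$ and $3 + \sqrt{2}$, so the limit range has length 2, in agreement with the lower bound 1 of Theorem 11.1.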

The Sturmian trajectories thus yield no example of a non-periodic Stur- 
mian trajectory with recurrency function R(n) such that R(n)/n has a finite 
limit as n becomes infinite. Whether or not there exist more general non- 
periodic recurrent trajectories such that this limit exists is at the present 
unknown. 

The numbers $\alpha$ and $\alpha'$ will be said to be equivalent if there exist integers $a, b, c, d$ with $ad - bc = \pm 1$ such that

$\alpha' = \dfrac{a\alpha + b}{c\alpha + d}$.

THEOREM 11.2. The limit ranges corresponding to equivalent irrational values of $\alpha$ are identical.


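Let $\alpha$ and $\alpha'$ be equivalent irrational numbers. Then $\alpha(1 + \alpha)^{-1}$ and $\alpha'(1 + \alpha')^{-1}$ are irrational and can be represented by continued fractions

$\alpha(1 + \alpha)^{-1} = [d_0, d_1, d_2, \cdots], \qquad \alpha'(1 + \alpha')^{-1} = [d'_0, d'_1, d'_2, \cdots]$.

It is readily shown that the equivalence of $\alpha$ and $\alpha'$ implies the equivalence of $\alpha(1 + \alpha)^{-1}$ and $\alpha'(1 + \alpha')^{-1}$ and consequently (Perron, p. 65) there exist integers $k$ and $j$ such that

$d_{k+i} = d'_{j+i} \qquad (i = 0, 1, 2, \cdots)$.

Let $D_\nu$ be the denominator of the $\nu$-th convergent of $\alpha(1 + \alpha)^{-1}$ and let $D'_\nu$ be the denominator of the $\nu$-th convergent of $\alpha'(1 + \alpha')^{-1}$. Then (cf. Perron, p. 32)

$\dfrac{D_{k+i}}{D_{k+i-1}} = [d_{k+i}, d_{k+i-1}, \cdots, d_1], \qquad \dfrac{D'_{j+i}}{D'_{j+i-1}} = [d'_{j+i}, d'_{j+i-1}, \cdots, d'_1]$.

It follows that

$\lim_{i \to \infty} \left( \dfrac{D_{k+i}}{D_{k+i-1}} - \dfrac{D'_{j+i}}{D'_{j+i-1}} \right) = 0$

and hence

$\lambda(\alpha) = \lambda(\alpha')$.

The theorem follows directly.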




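THEOREM 11.3. The limit range corresponding to $(\sqrt{5} + 1)/2$ is the interval

(11.6)  $\dfrac{3 + \sqrt{5}}{2} \le \tau \le \dfrac{5 + \sqrt{5}}{2}$.

The limit range corresponding to any irrational $\alpha$ which is not equivalent to $(\sqrt{5} + 1)/2$ contains the interval (11.6) in its interior.

If $\alpha = (\sqrt{5} + 1)/2$,

$\gamma = \alpha(1 + \alpha)^{-1} = \dfrac{\sqrt{5} - 1}{2} = [0, 1, 1, 1, \cdots] = [d_0, d_1, d_2, \cdots]$.

But then

$\dfrac{D_{\nu+1}}{D_\nu} = [d_{\nu+1}, d_\nu, \cdots, d_1] = [1, 1, \cdots, 1]$

and hence

$\lambda(\alpha) = \lim_{\nu \to \infty} \dfrac{D_{\nu+1}}{D_\nu} = [1, 1, 1, \cdots] = \dfrac{\sqrt{5} + 1}{2}$.

Thus the limit range corresponding to $\alpha = (\sqrt{5} + 1)/2$ is

$\dfrac{3 + \sqrt{5}}{2} \le \tau \le \dfrac{5 + \sqrt{5}}{2}$.

It follows from Theorem 11.2 that the limit range corresponding to any $\alpha$ which is equivalent to $(\sqrt{5} + 1)/2$ is also the interval (11.6).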


Suppose $\alpha$ is not equivalent to $(\sqrt{5} + 1)/2$. Let $[d_0, d_1, d_2, \cdots]$ represent the new value of $\gamma$ and let $C_\nu/D_\nu$ be the $\nu$-th convergent of $\gamma$. It follows from Perron, p. 65, that there exists no integer $k$ such that

$1 = d_k = d_{k+1} = \cdots$.

Consequently there exists an infinite sequence $\nu_1 < \nu_2 < \cdots$ such that $d_{\nu_i} \ge 2$, $(i = 1, 2, \cdots)$. But then

$\dfrac{D_{\nu_i}}{D_{\nu_i - 1}} = [d_{\nu_i}, d_{\nu_i - 1}, \cdots, d_1] \ge 2$,

and hence

$\lambda(\alpha) = \limsup_{\nu \to \infty} \dfrac{D_{\nu+1}}{D_\nu} \ge 2$.

It follows that the limit range of $\alpha$ contains the interval

$\dfrac{5}{2} \le \tau \le 4$

which includes (11.6) in its interior.

The proof of the theorem is complete.

THEOREM 11.4. If $\alpha = (\sqrt{5} + 1)/2$,

$\dfrac{3 + \sqrt{5}}{2} < \dfrac{R(n, \alpha)}{n} < \dfrac{5 + \sqrt{5}}{2}$.


For the given value of $\alpha$


1 
The relations (10.1) corresponding to the convergents $C_\nu/D_\nu$ of $\alpha(1 + \alpha)^{-1}$ become

$D_{-1} = 0, \quad D_0 = 1, \quad D_{\nu+1} = D_\nu + D_{\nu-1}, \qquad (\nu \ge 0)$.

It is an easy consequence of these relations that

(11.8)  $C_\nu = D_{\nu-1}, \qquad (\nu \ge 0)$.

Since $D_0 = D_1 = 1$ each positive integer $n$ lies on an interval of the form

$D_\nu \le n < D_{\nu+1}, \qquad (\nu > 0)$.
It follows from (11.3), (11.7) and (11.8) that 
(n, Dyas ] Cy 1 1 
) n + D Dy 


By virtue of Perron, p. 48, the relation + 


1 l a | 1 
— > | — — —— | = | —— 
Dy | Dy l+a | | dD, (x > 0) 
It follows that 
<0 (v > 0) 
and upon using (11.9) that 
R(n, 4) 
n 2 
Similarly it follows from (11.3) and (11.8) that 
R(n, Dy i Cvas 
But 
| 1 1 
0 
and hence 
a) 3+ V5 5) 


The proof of the theorem is complete. 
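
For $\alpha = (\sqrt{5} + 1)/2$ the denominators are the Fibonacci numbers, so the ratios $R(n, \alpha)/n$ can be tabulated exactly from (11.2) and compared with the end points of (11.6). The following sketch is illustrative; the function names are hypothetical and the range of $n$ is an arbitrary choice.

```python
import math


def fibonacci_denominators(limit):
    """D_0 = D_1 = 1, D_{v+1} = D_v + D_{v-1}, continued until one exceeds `limit`."""
    dens = [1, 1]
    while dens[-1] <= limit:
        dens.append(dens[-1] + dens[-2])
    return dens


def recurrency_golden(n, dens):
    """R(n, alpha) from (11.2): D_{v+1} + D_v + n - 1 for D_v <= n < D_{v+1}."""
    for d_lo, d_hi in zip(dens, dens[1:]):
        if d_lo <= n < d_hi:
            return d_hi + d_lo + n - 1
    raise ValueError("n exceeds the tabulated denominators")


dens = fibonacci_denominators(5000)
ratios = [recurrency_golden(n, dens) / n for n in range(1, 5001)]
low_end = (3 + math.sqrt(5)) / 2
high_end = (5 + math.sqrt(5)) / 2
print(min(ratios), low_end)
print(max(ratios), high_end)
```

In this range every ratio lies strictly between the two end points, and the extreme values come within about $3 \cdot 10^{-4}$ of them.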


It is clear that $R(n, \alpha)$ becomes infinite with $n$. For an irrational value of $\alpha$ it follows from SD, p. 830, that $R(n, \alpha) \ge 2n$ without exception. The following theorem is concerned with a more precise description of the manner in which $R(n, \alpha)$ becomes infinite with $n$. It is natural in analysis of this character to be concerned with $R(n, \alpha)$ for “almost all values” of $\alpha$ rather than with the function $R(n, \alpha)$ for individual values of $\alpha$.

Let $\phi(x)$ be a function which is positive and non-decreasing for $x \ge 0$ and such that $\lim_{x \to \infty} \phi(x) = +\infty$.


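THEOREM 11.5. For almost all $\alpha$

$\limsup_{n \to \infty} \dfrac{R(n, \alpha)}{n\,\phi(\log n)}$

is finite or infinite according as the series

$\sum_{n=0}^{\infty} \dfrac{1}{\phi(n)}$

is convergent or divergent.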


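Since the rational values of $\alpha$ form a set of measure zero, it is sufficient to prove the theorem for almost all irrational values of $\alpha$. Let

$S(\alpha) = \limsup_{n \to \infty} \dfrac{R(n, \alpha)}{n\,\phi(\log n)}$.

If $\alpha$ is irrational, it follows from (11.3) that

$S(\alpha) = \limsup_{\nu \to \infty} \dfrac{D_{\nu+1}}{D_\nu\,\phi(\log D_\nu)}$

where $D_\nu$ is the $\nu$-th denominator of $\alpha(1 + \alpha)^{-1} = \gamma$. With the aid of (10.1) we infer that

(11.10)  $S(\alpha) = \limsup_{\nu \to \infty} \dfrac{d_{\nu+1}}{\phi(\log D_\nu)}$

where the integers $d_\nu$ are those appearing in the continued fraction $[d_0, d_1, \cdots]$ representing $\gamma$.

But it is known (cf. Lévy, p. 289) that for almost all values of $\gamma$ and hence for almost all irrational values of $\alpha$

(11.11)  $\lim_{\nu \to \infty} \sqrt[\nu]{D_\nu} = K$,

where $K$ is an absolute constant such that $3 < K < 4$. On applying (11.11) to (11.10) we infer that for almost all irrational values of $\alpha$

(11.12)  $\limsup_{\nu \to \infty} \dfrac{d_{\nu+1}}{\phi(\nu \log 4)} \le S(\alpha) \le \limsup_{\nu \to \infty} \dfrac{d_{\nu+1}}{\phi(\nu \log 3)}$.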
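
The constant $K$ in (11.11) has a classical closed form, $K = e^{\pi^2/(12 \log 2)}$; this value is quoted here as a known result and is not stated in the text above, which uses only $3 < K < 4$. The sketch below evaluates it and, for contrast, computes $\sqrt[\nu]{D_\nu}$ for the exceptional expansion $[0, 1, 1, 1, \cdots]$. The function name is hypothetical.

```python
import math

# Closed form of Levy's constant (a classical value; the source only uses 3 < K < 4).
LEVY_K = math.exp(math.pi ** 2 / (12 * math.log(2)))


def nth_root_of_denominator(quotients):
    """Return D_v ** (1/v), where D_v is built from partial quotients d_1, ..., d_v."""
    d_prev, d_cur = 0, 1
    for d in quotients:
        d_prev, d_cur = d_cur, d * d_cur + d_prev
    return d_cur ** (1.0 / len(quotients))


print(LEVY_K)                                # roughly 3.2758
print(nth_root_of_denominator([1] * 30))     # Fibonacci denominators, roughly 1.60
```

The second value tends to $(\sqrt{5} + 1)/2$ rather than to $K$, which is consistent with (11.11) holding only for almost all $\gamma$.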


According to a theorem of Borel and Bernstein (cf. Koksma, p. 46, Theorem 14), if $\psi(x)$, $x \ge 1$, is a function which is positive and non-decreasing, the relation

$b_\nu = O(\psi(\nu))$

is true for almost all $\gamma$ or false for almost all $\gamma$ according as the series

$\sum_{\nu=1}^{\infty} \dfrac{1}{\psi(\nu)}$

is convergent or divergent. If the series

(11.13)  $\sum \dfrac{1}{\phi(\nu)}$

is convergent, it follows that the series

$\sum \dfrac{1}{\phi(\nu \log 3)}$

is convergent. On setting $\psi(x + 1) = \phi(x \log 3)$, $x \ge 0$, we infer from the Borel-Bernstein theorem that for almost all $\gamma$

$d_{\nu+1} = O(\phi(\nu \log 3))$.

It follows from (11.12) that $S(\alpha)$ is finite for almost all $\alpha$.

If the series (11.13) is divergent, it is easily shown that due to the hypotheses on $\phi(x)$ the series

$\sum \dfrac{1}{\phi(\nu \log 4)}$

is divergent. On setting $\psi(x + 1) = \phi(x \log 4)$, $x \ge 0$, we infer from the Borel-Bernstein theorem that for almost all $\gamma$ the relation

$d_{\nu+1} = O(\phi(\nu \log 4))$

does not hold. It follows from (11.12) that for almost all $\alpha$, $S(\alpha)$ is not finite.

The proof of the theorem is complete.

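If we set $\phi(x) = x$, $x \ge 1$, and $\phi(x) = 1$, $0 \le x < 1$, the following corollary is an immediate consequence of Theorem 11.5.

COROLLARY 11.1. For almost all $\alpha$

$\limsup_{n \to \infty} \dfrac{R(n, \alpha)}{n \log n} = +\infty$.

This implies in particular that for almost all $\alpha$

$\limsup_{n \to \infty} \dfrac{R(n, \alpha)}{n} = +\infty$.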


If $a > 1$ and we set $\phi(x) = x^a$, $x \ge 1$, $\phi(x) = 1$, $0 \le x < 1$, in Theorem 11.5 we obtain the following corollary.

COROLLARY 11.2. If $a > 1$,

$\limsup_{n \to \infty} \dfrac{R(n, \alpha)}{n (\log n)^a} = 0$

except for a set of values of $\alpha$ of measure zero.


We have excluded the case in which $\alpha$ is rational because for $\alpha = q/p$ with $(q, p) = 1$ the periodic Sturmian trajectory with frequency $\alpha$ has the period $\omega = p + q$ and

(11.14)  $R(n, \alpha) = \omega + n - 1, \qquad n \ge \omega$.

If $\alpha$ is an integer, (11.14) holds for $n > 0$. If $\alpha$ is rational but not an integer, (11.14) does not hold for all $n < \omega$, as we shall see. However, the values of $R(n, \alpha)$ for $n < \omega$ can be determined by methods similar to those applicable to the case when $\alpha$ is irrational.

If $\alpha = q/p$, where $(q, p) = 1$ and $\alpha$ is not an integer, and if we set

$\gamma = \dfrac{q}{p + q}$,

$\gamma$ is not an integer and admits a unique representation in the form of a continued fraction (cf. Perron, p. 30)

$\gamma = [d_0, d_1, \cdots, d_\mu]$.

The recursion formulas (10.1) determining the successive convergents $C_\nu/D_\nu$ of $\gamma$ are valid for $\nu \le \mu$. We state the following theorem without proof.

THEOREM 11.6. If $\alpha = q/p$ where $(q, p) = 1$ and $\alpha$ is not an integer, Theorem 10.1 holds for $n < D_\mu$. For $n \ge D_\mu$,

$R(n, \alpha) = p + q + n - 1$.

By means of this theorem it is easily shown that Theorem 10.2 is valid for positive rational as well as irrational values of $\alpha$.

12. Sturmian sequences in differential equation theory. We are concerned here with linear homogeneous second order differential equations with coefficients which are continuous in the independent variable $x$. We shall make use of the important canonical form

(12.1)  $y'' + \phi(x)\,y = 0$.

We assume that $\phi(x)$ has the period 1. Corresponding to an arbitrary solution $u(x)$ of (12.1) with $u \not\equiv 0$, let $T(u)$ and $T'(u)$ be respectively cell-series

$\cdots\, a\,b^{B_{-1}}\, a\,b^{B_0}\, a\,b^{B_1}\, a \cdots$

in which $B_n$ is the number of zeros of $u$ on the intervals

$n \le x < n + 1, \qquad n < x \le n + 1$.




It follows from the well-known Sturmian separation theorem that $T(u)$ and $T'(u)$ are Sturmian in the sense of §2. Moreover the frequency $\alpha$ of $T(u)$ or of $T'(u)$ depends only on $\phi(x)$ and not upon the choice of the solution $u$. We may refer to $\alpha$ as the frequency of (12.1).

The cell-series $T(u)$ and $T'(u)$ include all of the types which we met in the general study. Consider for example the equation

$y'' + a^2 y = 0$

where $a$ is a positive constant. The corresponding cell-series $T(u)$ and $T'(u)$ have the frequency $a/\pi = \alpha$ and, as is easily seen, include all of the types $T(c, \alpha)$ and $T'(c, \alpha)$, respectively, for suitable choices of $a$ and of $u$. When $a = 0$ we have the solution $x$. The series $T(x)$ is skew Sturmian of the form $S'(0, 0)$. To obtain skew Sturmian series $T(u)$ more general in form it is necessary to go somewhat deeper.
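
The example $y'' + a^2 y = 0$ can be made concrete by counting zeros directly. The sketch below is an illustration under stated assumptions: the solution is taken as $u(x) = \sin(ax - \theta)$, whose zeros are $(k\pi + \theta)/a$ for integral $k$, and $B_n$ is counted on the half-open interval $n \le x < n + 1$. The function names and the sample values of $a$ and $\theta$ are hypothetical.

```python
import math


def zero_counts(a, theta, n_start, n_stop):
    """B_n = number of zeros of u(x) = sin(a*x - theta) with n <= x < n + 1."""
    def ceil_index(x):
        # zeros are x_k = (k*pi + theta)/a; ceil_index(x2) - ceil_index(x1)
        # counts the integers k with x1 <= x_k < x2
        return math.ceil((a * x - theta) / math.pi)
    return [ceil_index(n + 1) - ceil_index(n) for n in range(n_start, n_stop)]


def cell_series(counts):
    """Write the finite block a b^{B_n} a b^{B_{n+1}} ... for the given counts."""
    return "".join("a" + "b" * count for count in counts)


counts = zero_counts(2.0, 0.3, 0, 10000)
print(cell_series(counts[:12]))
print(sum(counts) / len(counts), 2.0 / math.pi)
```

The counts take only two consecutive integral values, and their mean approaches $a/\pi$, the frequency named in the text.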

We recall a few facts in the classical theory of differential equations of the type (12.1). Let $y(x)$ and $w(x)$ be solutions of (12.1). Keeping $x$ real we admit solutions of (12.1) of the form $Ay(x) + Bw(x)$, where $A$ and $B$ are complex constants. Let $p$ be an arbitrary positive integer. As is well known, there exists at least one solution $u(x)$ of (12.1) such that

(12.2)  $u(x + p) = \rho\,u(x), \qquad (\rho \ne 0)$,

where $\rho$ is a real or complex constant. We term $\rho$ a characteristic root of index $p$. The roots $\rho$ satisfy a quadratic equation, the product of whose roots is 1. There are two principal cases according as the roots $\rho$ are real and positive, or not real and complex. There is also the degenerate case in which the roots are equal. The equation (12.1) possesses a canonical pair of independent solutions whose properties depend upon the classification of the roots $\rho$. It would not be difficult to show the precise connection between these canonical forms and the types of trajectories $T(u)$ and $T'(u)$ defined by the solutions of (12.1). We shall not go into details beyond proving the following theorem.

THEOREM 12.1. In case the differential equation (12.1) possesses two real positive unequal characteristic roots of index $p$, then for a suitable choice of the origin and of the solution $u(x)$, the series $T(u)$ and $T'(u)$ are skew Sturmian trajectories with a frequency of the form $q/p$.


Let $c$ be one of the roots of index $p$. The reciprocal of $c$ is another such root. We set $a = p^{-1} \log c$. It is easy to prove that there are two independent solutions of (12.1) of the form

$y(x) = e^{ax} A(x), \qquad w(x) = e^{-ax} B(x)$,

where $A(x)$ and $B(x)$ have the period $p$. Moreover

$y(x + p) = c\,y(x), \qquad w(x + p) = \dfrac{1}{c}\,w(x)$,

and the only solutions $u(x)$ of (12.1) which satisfy a relation of the form (12.2) are the constant multiples of $y(x)$ and $w(x)$.

Suppose that $A(x)$ vanishes $q$ times on the interval $0 \le x < p$. The function $B(x)$ likewise vanishes $q$ times on $0 \le x < p$ since the zeros of $y(x)$ and $w(x)$ mutually separate each other. The series $T(y)$ has the frequency $q/p$ as do the series $T'(y)$, $T(w)$, etc.

Let $\omega$ be a point which is not a zero of $y(x)$ nor of $w(x)$, and let $u(x)$ be a solution which vanishes at $\omega$ without being identically zero. Then $u(\omega + p) \ne 0$. Otherwise for some constant $\rho > 0$

$u(x + p) = \rho\,u(x)$.

As we have seen, $u(x)$ would then be a constant multiple of $y(x)$ or of $w(x)$, contrary to the hypothesis that $\omega$ is not a zero of $y(x)$ or of $w(x)$. The $q$-th zero $\omega'$ of $u(x)$ following $\omega$ is such that $\omega' - \omega \ne p$. If $\omega' - \omega < p$, then after a suitable change of coordinates of the form $x' = x + x_0$, the interval $0 < x' < p$ will include both $\omega$ and $\omega'$ and $T(u)$ will possess a $p$-chain of $b$-length $q + 1$. If $\omega' - \omega > p$ and $x_0$ is suitably chosen, the interval $0 \le x' \le p$ will include just $q - 1$ zeros of $u(x)$, and $T(u)$ will possess a $p$-chain of $b$-length $q - 1$. In either case the series $T(u)$ as well as the series $T'(u)$ is skew Sturmian, and the proof of the theorem is complete.
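
The characteristic roots of (12.2) can be approximated numerically. The following is a minimal sketch, not the authors' method: it integrates (12.1) over $0 \le x \le p$ with a hand-written Runge-Kutta step for the two solutions with initial data $(1, 0)$ and $(0, 1)$, and takes the roots of the quadratic $\rho^2 - (\text{trace})\rho + (\text{determinant}) = 0$ of the resulting $2 \times 2$ matrix. The function names, the step count, and the two constant test coefficients are assumptions made for illustration.

```python
import cmath
import math


def monodromy(phi, period, steps=2000):
    """Values (u(p), u'(p)) for the solutions of y'' + phi(x) y = 0 starting at (1,0), (0,1)."""
    def rhs(x, y, dy):
        return dy, -phi(x) * y

    h = period / steps
    columns = []
    for y, dy in ((1.0, 0.0), (0.0, 1.0)):
        x = 0.0
        for _ in range(steps):  # classical fourth-order Runge-Kutta step
            k1 = rhs(x, y, dy)
            k2 = rhs(x + h / 2, y + h / 2 * k1[0], dy + h / 2 * k1[1])
            k3 = rhs(x + h / 2, y + h / 2 * k2[0], dy + h / 2 * k2[1])
            k4 = rhs(x + h, y + h * k3[0], dy + h * k3[1])
            y += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
            dy += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
            x += h
        columns.append((y, dy))
    return columns


def characteristic_roots(phi, period, steps=2000):
    """Roots rho of the quadratic attached to u(x + p) = rho * u(x)."""
    (m11, m21), (m12, m22) = monodromy(phi, period, steps)
    trace = m11 + m22
    det = m11 * m22 - m12 * m21      # equals 1 up to integration error
    disc = cmath.sqrt(trace * trace - 4 * det)
    return (trace + disc) / 2, (trace - disc) / 2


print(characteristic_roots(lambda x: 4.0, 1.0))    # complex pair of modulus 1
print(characteristic_roots(lambda x: -1.0, 1.0))   # real positive pair, e and 1/e
```

In both runs the product of the two roots is 1 to within the integration error. The second coefficient gives real positive unequal roots, although its solutions have at most one zero, so it illustrates only the classification of the roots.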


THE INSTITUTE FOR ADVANCED STUDY. 


BIBLIOGRAPHY. 


Birkhoff, G. D., Dynamical Systems, American Mathematical Society Colloquium Publications (1927).

Koksma, J. F., “Diophantische Approximationen,” Ergebnisse der Mathematik und ihrer Grenzgebiete, IV, 4 (1936).

Lévy, Paul, “Sur le développement en fraction continue d’un nombre choisi au hasard,” Compositio Mathematica, vol. 3 (1936), pp. 286-303.

Morse, Marston and G. A. Hedlund, “Symbolic Dynamics,” American Journal of Mathematics, vol. 60 (1938), pp. 815-866.

Perron, Oskar, Die Lehre von den Kettenbrüchen, Teubner (1929).



ON THE METHOD OF FINDING ISOTROPIC STATIC SOLUTIONS 
OF EINSTEIN’S FIELD EQUATIONS OF GRAVITATION.* 


By P. Y. CHOU.


1. Introduction. The isotropic static solutions of Einstein’s field equations can be obtained by solving a set of differential equations given by the author. For the type of problem which involves the determination of isotropic fields of a single body the complete solution of the problem reduces first to the solution of a non-linear partial differential equation of the second order (I, (3.4)), and secondly to the transformation of $ds^2$ in the $(u, v, w)$ coordinates to the canonical form (I, (2.4)). When the problems are simple such as the examples given in I, where the non-linear partial differential equation degenerates into an ordinary equation, these two steps can be accomplished with ease. But in the general problem the solution of the partial differential equation and the transformation of coordinates are both difficult.

An alternative way of approach is to compute the equations (2.5) in I in terms of the $(x, y, z)$ coordinates in the canonical form (2.4). Then we have two dependent variables, $U$ and $\sigma$, satisfying seven partial differential equations with three independent variables, $x, y, z$. Unfortunately these equations are non-linear in $U$ and $\sigma$ and the determination of their general solution is by no means simple.

In the present paper we shall derive from (2.5) in I, as a further necessary condition, another set of partial differential equations whose solutions also satisfy the field equations. The advantage of the present treatment over the previous one is, as we shall show presently, that we have to solve a set of seven non-linear partial differential equations of the third order satisfied by only one dependent variable $\sigma$ while the other function $U$ can be constructed out of $\sigma$ and the partial derivatives of $\sigma$. Out of the seven partial differential equations we shall derive the well-known Laplace’s equation. Hence as a method of procedure we can choose a harmonic function and this harmonic function will define an isotropic static gravitational field provided it satisfies the remaining six partial differential equations simultaneously. As an illustration of the present method we shall show that Kasner’s solution (I, (3.5)) and the field of the semi-infinite plane (I, (3.9)) are the only two-dimensional isotropic static fields in Einstein’s theory of gravitation.


* Received October 30, 1938. 
¹ P. Y. Chou, American Journal of Mathematics, vol. 59 (1937), p. 754, which will be referred to as “I”, eq. (2.5).


2. Equations determining $\sigma$. We write down some of the important equations in I. The field equations (I, (2.3)) are


(2.1) Gij = Rij + =0, Go, = 0, Goo = UU = 
The canonical form of the isotropic static arc element (I, (2.4)) is 
(2. 2) ds? = U*dl? — + dy? + dz’). 


The equations determining the isotropic solutions of the field equations can 
be written as (I, (2.5)), 


(2. 3) Ui; — 4 jU*U,), and 
From (2.3) we can derive as an integral (I, (2.6)), 

(2. 4) = — k?(c + U?)*. 


We can eliminate U;,; between (2.1) and (2.3). Then 


gijU"Un), =0. 


(2. 5) 


C+ U? 
In fact (2.5) was obtained when we proved the sufficiency of (2.3) determin- 
ing the isotropic solutions of (2.1) (I, (2.17)). Now let us form the 
invariant 2”"Rinn from (2.5): 
By using (2.4) we obtain as a consequence 
(2. 7) (c + U?)® = 

Equations (2.5) and (2.7) are tensor equations and, accordingly, hold in any system of coordinates. If we compute $R_{ij}$ from the form (2.2), we find²
(2. 8) Rij = — + 10,5 — gij (Azo + Ajo) 
where 

Avo = ga, ij, = 
If we calculate o,;; explicitly, we get 


Ry = — + oy? + — 3 + oy" + a2), 


= — Cay — 


(2. 9) 


where we put $\sigma_{xx} = \partial^2\sigma/\partial x^2$, $\sigma_y = \partial\sigma/\partial y$, etc., and the other components of the tensor $R_{ij}$ can be obtained by cyclic permutations of the coordinates $x, y, z$.


2L. P. Eisenhart, Riemannian Geometry (1926), p. 9), eq. (28.6). 




Now we can eliminate $U$ between (2.5) and (2.7) with the understanding that the components of $R_{ij}$ are given by (2.9). Then we see that (2.5) are seven non-linear partial differential equations of the third order in $\sigma$. Since $\sigma$ has to satisfy seven equations simultaneously, its number of solutions must be limited. But if we can obtain one solution, then the corresponding function $U$ is determined automatically by (2.7).

The conditions (2.5) and (2.7), with $R_{ij}$ given by (2.9), are thus necessary for the field equations (2.1) to possess isotropic solutions in empty space. They are also sufficient. For from (2.5) and (2.8) we have


(2. 10) Rix j 


6 2 2 

4U 


If we contract (2.10) by g‘’ and U* separately and eliminate the intermediate 
expression U"U;,;, we find (2.3) again. Then the field equations (2.1) are 
also satisfied by the theorem proved in I. Hence we may summarize the above 


results in the following theorem: 


THEOREM. A necessary and sufficient condition for the static field equations (2.1) to possess isotropic solutions in empty space is that the function $\sigma$ should satisfy equations (2.5) where $R_{ij}$ and $U$ are given by (2.9) and (2.7) respectively.

The theorem proved in I lays emphasis on the function $U$ and the present theorem deals primarily with $\sigma$. The method of obtaining $\sigma$ is as follows: From (2.5) and (2.9) we find


el? (Fao + fy + fez) =0; f=e?2, 


In other words $f$ satisfies the classical Laplace’s equation. Since Laplace’s equation possesses a large variety of solutions, the isotropic static fields are defined by those which will also satisfy the other equations in (2.5). Hence we may test whether any harmonic function $f$ defines an isotropic static field by simply constructing $R_{ij}$ and $U$ according to (2.9) and (2.7) and see whether every member of the equations in (2.5) is verified.
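
The first step of this procedure, confirming that a candidate $f$ is harmonic, can be checked numerically before the remaining equations are examined. The sketch below covers only that step and is not taken from the paper: it estimates $f_{xx} + f_{yy} + f_{zz}$ by central differences, and the candidate functions, the sample points, and the name `laplacian` are illustrative choices.

```python
import math


def laplacian(f, x, y, z, h=1e-3):
    """Second-order central-difference estimate of f_xx + f_yy + f_zz at (x, y, z)."""
    center = f(x, y, z)
    return (
        f(x + h, y, z) + f(x - h, y, z)
        + f(x, y + h, z) + f(x, y - h, z)
        + f(x, y, z + h) + f(x, y, z - h)
        - 6.0 * center
    ) / (h * h)


def inverse_distance(x, y, z):
    """A standard harmonic function away from the origin (illustrative candidate)."""
    return 1.0 / math.sqrt(x * x + y * y + z * z)


def squared_distance(x, y, z):
    """Not harmonic: its Laplacian is 6 everywhere (illustrative counterexample)."""
    return x * x + y * y + z * z


print(laplacian(inverse_distance, 1.0, 2.0, 3.0))   # close to 0
print(laplacian(squared_distance, 1.0, 2.0, 3.0))   # close to 6
```

A candidate that passes this check is only admissible for the test described above; whether it defines an isotropic static field still depends on the other equations in (2.5).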

We have dealt with the case in empty space only. A corresponding theorem 


within matter can also be proved in a similar way. 


3. Two-dimensional problem. Although we have given the general method of finding isotropic static fields in the previous section, it is still rather laborious to solve the general three-dimensional problem. On the other hand when we restrict ourselves to the two-dimensional case, the present method with slight modifications can give us all the isotropic fields; it is decidedly simpler than trying to solve the set of equations (2.3) directly, for even in the canonical form (2.2) when the coordinate $z$ is absent from the functions $U$ and $\sigma$, (2.3) are still non-linear and not much information can be obtained from these equations.

Since both o and U are assumed to be independent of z, we have from 
(2.9), 


Ry, =— + oy? — (027 + [U2 + U,’)], 
Rog = — + — + oy?) = 4(U.? + U,?)], 


6 


(3.1) 


= — Gay — = — 


the other components of $R_{ij}$ all vanishing identically. According to the given procedure we should eliminate $U$ from (3.1) by using (2.7). But this is clumsy and we shall avoid using it. Instead, from (2.4) and $R_{33}$ in (3.1) we find

(3. 2) (c + U?)*® = (2? + o,?) /4k’. 


Furthermore we can drop out the common expression f;; in Ri, and Re» 
and get 


+ Cx" 6U27/(¢ + U’), 
(3. 3) Res: Oy + = 6U,7/(e + 
Bus: Cry + = 6U2Uy/(e + 
The harmonic function f in (2.11) can now be taken to be 


(3. 4) feel? — F(a + iy) + F(e—iy) =P + F*, 


where F is an analytic function of the complex variable «+ iy. Then the 
partial derivatives of o with respect to x and y can be computed. If we denote 
the derivative of F with respect to its argument by F, we find (3.2) to be 


(3. 5) (c + U*)® = 4FF*/kf°. 


By means of (3.4) and (3.5) we can put (3.3) in the following form: 




6F 6F*>? 


Then we form linear combinations of the above equations. The expression 
Ri: + Roe gives 


f 


which in combination with (3.5) also gives (2.4). From R:,— Roo we find 
(3.8) 
(F*  6F*\? 
6U f F* f 


From #,, in (3.6) and (3.8) we can easily see that the following equation 
and its conjugate must hold: 


2s 6. sry’ 
F f 


which can be simplified into 
+ /F? — 6)? = 0. 


Since only U? presents itself in the arc element (2.2), we may take the 
positive sign in front of U, namely, 


(3. 10) fF/F? =6V—c/(U + V—o). 


Taking (3.9) and its conjugate, we can eliminate U? from (3.7) and obtain 


(9) 

d d 1 di d ) 

f dz dz* i) dz* F* 


Inserting F from (3. 10) into (3.11) and simplifying, we find finally 


(3. 11) 


(3. 12) [V—c+ (V—c)*] U =0. 




In other words, the arbitrary constant c must either be zero or a positive 
constant which can be taken to be unity without the loss of generality. 


Case (1). c=0. Then from (3.10), we find 
(3. 13) = 0, F=b2z/2, and 


From (3.5) we get U? = 1/b*x* so that k —b. This is Kasner’s solution 


for an infinite plane (I, (3. 5)). 


Case (2). c¢=1. Then F must be different from zero and (3.11) can 


be written as: 
(3. 14) dF? + dF* dF*? 


Since z and z* are actually independent, we may conclude that: 


where « must be a real constant. In fact « can be put equal to zero without 
any loss of generality, for it only adds an imaginary constant to F and in the | 
final expression of f in (3.4), it does not appear. Hence integrating (3.15), 

we find 

(3. 16) = — (Bz), 


in which £ is a constant of integration. 

We have two expressions of U from (3.10) and (3.5). They must be 
identical so that BB* = 64k?. Then 

1 
(3.17) U? =tan?(¢+58)/2, e# ft = cos* (¢ + 8) /2, 
Pp 

where $\delta$ is the amplitude of the complex number $\beta$, $\phi$ and $\rho$ are the amplitude and modulus of $x + iy$. This represents the field of the semi-infinite plane (I, (3.9)). In other words Kasner’s solution and the field of the semi-infinite plane are the only two-dimensional isotropic static fields in empty space according to Einstein’s theory of gravitation.


THE NATIONAL SOUTH-WEST ASSOCIATED UNIVERSITY (being a wartime union of National Tsing Hua University and National Peking University of Peiping, and Nankai University of Tientsin),

KUNMING, YUNNAN PROVINCE, CHINA.




ON THE ALMOST PERIODIC BEHAVIOR OF THE LUNAR NODE.*


By AUREL WINTNER.


Introduction. In view of Newton’s deduction of his approximation
formula for the mean motion, $\omega$, of the ascending node of the lunar path (cf.,
e.g., Tisserand [6], pp. 42-44), the constant $\omega$ may be characterized, from
the astronomical point of view, not only as the average velocity of the nodal
angle $\vartheta = \vartheta(t)$, but also in terms of the relative number of times the Moon
passes through the ecliptic (on the average). Correspondingly, the precise
form of Newton’s approximation theory, as developed by Adams by means of
infinite determinants, is directly based on the Jacobian differential equation
which determines the ordinate $z = z(t)$, if the ecliptic is the $(x, y)$-plane
(cf., e.g., Tisserand [6], pp. 286-288).

From the mathematical point of view, there immediately arise several
questions. Some of these have been investigated by Levi-Civita [4], who
proved that the two definitions of $\omega$ (those based on $\vartheta(t)$ and $z(t)$, respec-
tively) are equivalent, and that the limit which $\omega$ is supposed to represent
actually exists; so that, in the theory of Adams,


(1) $\vartheta(t) = \omega t + \psi(t)$, where $\omega = \text{const.}$ and $|\psi(t)| < \text{Const.}$


The present paper deals with certain analytical refinements of Levi- 
Civita’s result; refinements which, though of apparent astronomical significa- 
tion, can only be treated by using analytical tools developed recently (Levi- 
Civita’s paper appeared in 1911). 

The modern theory of the Moon, as originated by Hill and further
developed by Brown (cf., e.g., Poincaré [5]), is based on certain tacit assump-
tions which, for the case at hand, imply that the non-secular part of $\vartheta(t)$,
i.e., the remainder term $\psi(t)$ of (1), may be analyzed into an anharmonic
Fourier series. This assumption will be justified by proving that $\psi(t)$ is
almost periodic (almost periodicity will always be meant in the sense of Bohr).
The formal situation is as follows:

The variational equation of Adams for the ordinate z is of the form 


(2) $z'' + f(t)z = 0$,


where f(t) is a given periodic function of the time. One can write this 
differential equation of the second order in the form 


* Received January 3, 1939. 


(3) $u' = a(t)u + b(t)v$, $v' = c(t)u + d(t)v$


of two differential equations of the first order. Then, on introducing into the
$(u, v)$-plane polar coördinates by placing


(4) $u = (u^2 + v^2)^{1/2}\cos\vartheta$, $v = (u^2 + v^2)^{1/2}\sin\vartheta$,


one can conclude from a general theorem, that not only is $\vartheta(t)$ of the form
(1) but, in addition, $\psi(t)$ is almost periodic (cf. Wintner [9]).

However, this approach to the problem is based on the assumption that
the coördinate $\vartheta = \vartheta(t)$ of the ascending node of the Moon is identical with
the angle $\vartheta = \vartheta(t)$ which is defined by (4). Now, the latter $\vartheta$, being the
polar angle in the $(u, v)$-plane, clearly is not the lunar node, if one writes
(2) in the form (3) by placing $u = z$, $v = z'$ (or $u = z'$, $v = z$). Fortunately,
it turns out that the polar angle in the $(u, v)$-plane becomes identical with
the nodal coördinate $\vartheta = \vartheta(t)$ of the Moon if, instead of identifying $u$, $v$ with
$z$, $z'$, one subjects the pair $z$, $z'$ to a suitable linear substitution whose matrix
is a certain periodic function of $t$, and then defines $u$, $v$ as the resulting linear
combinations of $z$, $z'$.

Due to the relations which Levi-Civita used when identifying the two
definitions of $\omega$ (cf. the beginning of this paper), the linear transformation
defining $u$, $v$ is quite explicit. Correspondingly, the existence of $\omega$ and the
almost periodicity of $\psi(t)$ will be proved directly. This direct proof will not
involve an actual modification of the program sketched above. In fact, the
proof depends, in either case, on an application of the following theorem,
formulated as a conjecture by the present author, and subsequently proved
by Bohr [1]:

If $\vartheta(t)$ is real and $\exp i\vartheta(t)$ almost periodic, then there exist a constant
$\omega$ and an almost periodic function $\psi(t)$ such that $\vartheta(t) = \omega t + \psi(t)$. (The
converse of this theorem is obvious.)

Since the proofs will be based on the theory of almost periodic functions, 
the proofs are independent of the results of Levi-Civita (cf. the beginning of 
this paper), which, therefore, follow as corollaries. 

The almost periodicity of the remainder term $\psi(t)$ will also imply the
existence of an asymptotic distribution function for the angular variable $\vartheta(t)$.
This means that there exists an asymptotic probability $p = p(\alpha, \beta)$ that the
lunar node $\vartheta(t)$ will lie on a given arc $\alpha \le \vartheta(t) \le \beta$, where $\vartheta = \vartheta(t)$ is
thought of as reduced mod $2\pi$. Needless to say, (1) in itself would be insuffi-
cient to guarantee the existence of such an asymptotic distribution function.


1. The considerations of this section are more general than those actually 


needed for the problem at hand, and concern the explicit characterization of 
all (non-conservative) linear canonical transformations in case of n degrees 
of freedom. The result, though of algebraic simplicity, does not seem to occur 
in the classical literature of the subject, apparently because the verification 
requires some effort, if one starts out with the standard Pfaffian criterion of 
canonical transformations. On the other hand, the result follows quite 
naturally by using a method developed recently (cf. Wintner [8], van Kampen 
and Wintner [7]). 

Reserving the sign $'$ for $d/dt$, let $A^*$ denote the transposed matrix of the
matrix $A$ (although all matrices occurring will be real); so that $A'^* = A^{*\prime}$,
where $A = A(t)$. Correspondingly, the bilinear form belonging to a matrix
$B$ will be denoted by $Y^*BX$, where $X$, $Y$ are real column vectors. Let $E$ be
the $n$-rowed unit matrix, $O$ the $n$-rowed zero matrix, and $I$ the $2n$-rowed
skew-symmetric matrix


(5) $I = \begin{pmatrix} O & E \\ -E & O \end{pmatrix}$, so that $I^{-1} = I^* = -I$, $\det I = +1$.

Then the most general linear (homogeneous) Hamiltonian system with $n$
degrees of freedom is

(6) $IX' = S(t)X$,


where $S(t)$ is an arbitrarily given, $2n$-rowed, symmetric, possibly singular,
continuous matrix function of the time $t$. In fact, if $x_1, \dots, x_{2n}$ denote the
components of the vector $X$, one can write (6) as


$x'_i = -\partial H/\partial x_{i+n}$, $x'_{i+n} = \partial H/\partial x_i$, $(i \le n)$,


where $H = H(X; t)$ is the quadratic form $H = \tfrac12 X^*S(t)X$; so that $x_{i+n}$,
where $i = 1, \dots, n$, is the $i$-th coordinate, and $x_i$ the momentum canonically
conjugate to $x_{i+n}$.

If one subjects $X$ to an arbitrary linear substitution


(7) $X = T(t)\bar X$,


where $T(t)$ is a matrix function of $(2n)^2$ elements which have continuous
first derivatives and a non-vanishing determinant, then (6) clearly is trans-
formed into a system of $2n$ differential equations of the first order which are
again homogeneous and linear and can, therefore, be written in the form


(8) $I\bar X' = \bar S(t)\bar X$,


the matrix (5) being non-singular; so that $\bar S(t)$ is uniquely determined by
$S(t)$, $T(t)$ and the derivative matrix $T'(t)$. However, the transform (8) of




the canonical system (6), is not, in general, again canonical, since $\bar S(t)$ need
not be a symmetric matrix whenever $S(t)$ is symmetric. Correspondingly, (7)
may be defined to be a canonical transformation if it has the property that
(8) is a canonical system for every canonical system (6), i.e., if $\bar S^*(t) = \bar S(t)$
whenever $S^*(t) = S(t)$.

Now, $T(t)$ will have this property if and only if there exists a scalar
constant $\mu \neq 0$ such that


(9) $T^*(t)\,I\,T(t) = \mu I$ for every $t$, $(\mu' = 0)$.

(It may be mentioned that (9) implies that

(9 bis) $\det T(t) = \mu^n$, $(\det T(t) \neq 0)$,


and not only that $|\det T(t)| = |\mu|^n$). Furthermore, if the necessary and
sufficient condition (9)—(9 bis) for a canonical linear transformation (7) is
satisfied, the matrix $\bar S(t)$ of the transformed Hamiltonian function, i.e., of
the quadratic form $\tfrac12\bar X^*\bar S(t)\bar X$, follows from


(10) $T^*ST = \mu\bar S + T^*IT'$, where $S = S(t)$, $\bar S = \bar S(t)$, $T = T(t)$, $\det T(t) \neq 0$


(so that the matrix $\bar S$ given by (10) is symmetric for every symmetric $S(t)$ if,
and only if, (9) is satisfied).
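As an illustration only, the criterion (9) can be checked numerically for a constant sample matrix. The sketch below assumes the block arrangement of (5) with the unit matrix in the upper right corner; the function names (`skew_unit`, `canonical_multiplier`, and so on) and the sample matrices are hypothetical and do not occur in the text. The function returns the multiplier $\mu$ when $T^*IT = \mu I$ holds within a tolerance, and `None` otherwise.

```python
import math


def matmul(a, b):
    """Plain product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def skew_unit(n):
    """The 2n-rowed matrix of (5): block E upper right, block -E lower left."""
    m = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        m[i][n + i] = 1.0
        m[n + i][i] = -1.0
    return m


def block_diag(a, b):
    """2n-rowed matrix with the n-rowed blocks a, b on the diagonal."""
    n = len(a)
    top = [list(row) + [0.0] * n for row in a]
    bottom = [[0.0] * n + list(row) for row in b]
    return top + bottom


def rotation(phi):
    """Two-rowed rotation matrix through the angle phi."""
    return [[math.cos(phi), -math.sin(phi)],
            [math.sin(phi), math.cos(phi)]]


def canonical_multiplier(t_mat, tol=1e-12):
    """Return mu if T* I T = mu I with mu != 0 (condition (9)), else None."""
    n = len(t_mat) // 2
    i_mat = skew_unit(n)
    lhs = matmul(matmul(transpose(t_mat), i_mat), t_mat)
    mu = lhs[0][n]  # position at which I carries the entry +1
    ok = all(abs(lhs[r][c] - mu * i_mat[r][c]) <= tol
             for r in range(2 * n) for c in range(2 * n))
    return mu if ok and abs(mu) > tol else None


# n = 1: every non-singular two-rowed matrix satisfies (9), mu being its determinant.
print(canonical_multiplier([[2.0, 1.0], [1.0, 3.0]]))
# n = 2: the same rotation applied to both halves of X.
print(canonical_multiplier(block_diag(rotation(0.3), rotation(0.3))))
# n = 2: a block-diagonal matrix which violates (9).
print(canonical_multiplier(block_diag([[1.0, 0.0], [0.0, 1.0]],
                                      [[1.0, 1.0], [0.0, 1.0]])))
```

For a $T$ depending on $t$ the same test would have to be repeated at each $t$, with one and the same value of $\mu$.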

The proof of the statements (9), (10) will be omitted. For, on the one
hand, these statements may be verified from the general (non-linear) results,
obtained loc. cit. [7], at least if one disregards the fact that, this time, only
the existence of a first continuous derivative $T'(t)$ is required (the problem
being linear). And, on the other hand, a direct verification proceeds in exactly
the same way as in the particular case $T(t) = \text{const.}$, treated loc. cit. [8].


2. Suppose, in particular, that (7) transforms momenta into momenta
and coördinates into coördinates; so that


(11) $T(t) = \begin{pmatrix} A(t) & O \\ O & B(t) \end{pmatrix}$,


where $A(t)$, $B(t)$ are non-singular $n$-rowed matrices. It is easily verified from
(5) that (9) is satisfied by (11) and $\mu = +1$ if and only if $A^*(t) = B^{-1}(t)$.

It follows that in the particular case $A = B$ of a cogredient transforma-
tion of the momenta and coordinates, the condition is that $A(t)$ be, for every $t$,
an $n$-rowed orthogonal matrix (of determinant $\pm 1$). Application of (10)
to this particular case shows that the transformed Hamiltonian function is


(12) —4X*S(t)X + H{U*A'(t) A*(t) V — V*A'(t) A*(t)*0}, 


where $U$ and $V$ are the vectors with $n$ components, $(x_i)$ and $(x_{i+n})$, which
are formed by the $n$ first and $n$ last components of the vector $X$ with $2n$
components.

For instance, if the degree of freedom is an even number, say $n = 2m$,
condition (9) is satisfied (with $\mu = +1$) if $T(t)$ is the $2n$-rowed matrix
which one obtains by repeating, $2m$ times along the principal diagonal, any
two-rowed rotation matrix


(13) $\begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix}$, where $\phi = \phi(t)$

is any given scalar function which has a continuous derivative $\phi'(t)$. In this
case, the quadratic form $\tfrac12\{\ \}$, which in (12) represents the deviation of the
new and old Hamiltonian functions, readily reduces to


k=1 
if 3, Lon-1, Zen are respectively denoted by £1, Hi, Ze, -, 


Em, nm, Where m = $n. 


3. Let the degree of freedom be $n = 1$, and write $p, q$; $u, v$ for $x_1, x_2$;
$\bar x_1, \bar x_2$, respectively; so that (7) becomes


(15) $p = \alpha(t)u + \beta(t)v$, $q = \gamma(t)u + \delta(t)v$.


It is easily verified from (5), where $E = 1$, $O = 0$ in the present case, that
the necessary and sufficient condition (9) for a linear canonical transformation
is satisfied by (15) if and only if $\det T(t) = \text{const}$. But $\text{const.} = \mu$, by
(9 bis); so that the criterion takes the form


(16) $\alpha(t)\delta(t) - \beta(t)\gamma(t) = \mu$, where $\mu = \text{const.} \neq 0$.
It follows, therefore, by straightforward reductions that 


2Aya(t Apy(t) — Aaa(t 
(17) 4uIT’(t) T(t) py(t) ( 


Apy(t) — Aas(t) 2Aag(t) 
if the elements of this two-rowed matrix denote the determinants defined by

(18) $\Delta_{\kappa\lambda}(t) = \kappa'(t)\lambda(t) - \lambda'(t)\kappa(t)$, where $\kappa' = d\kappa/dt$.


If, in particular, the constant (16) is 1, then, on denoting the new and old
Hamiltonian functions, i.e., the quadratic forms $\tfrac12\bar X^*\bar S(t)\bar X$ and $\tfrac12 X^*S(t)X$,
by $K(u, v; t)$ and $H(p, q; t)$, one sees from (10), (7) and (17) that




(19) K(u,v;t) =H (p,q; t) + + — Acs) uv + 
where $H(p, q; t)$ is thought of as expressed by means of the inverse of the
substitution (15) as a function of $(u, v; t)$, and the $\Delta$ are, in view of (18),
given functions of $t$.

4. Let

(20) $x'' - 2y' = \Omega_x(x, y, z)$, $y'' + 2x' = \Omega_y(x, y, z)$, $z'' = \Omega_z(x, y, z)$

be the equations of motion of the (non-planar) restricted problem of three
bodies in a synodical barycentric coördinate system $(x, y, z)$; so that the axis
of syzygies is the $x$-axis which rotates, with reference to a sidereal planar
coordinate system which coincides with the $(x, y)$-plane, with constant angular
velocity. Thus, (20) is an irreversible, conservative, dynamical system with
three degrees of freedom, admitting Jacobi’s integral of relative energy:


(21) $\tfrac12(x'^2 + y'^2 + z'^2) - \Omega(x, y, z) = \text{const.}$


The classical mathematical literature of the restricted problem of three
bodies concerns the case $z(t) \equiv 0$ of a planar solution


(22) $x = x(t)$, $y = y(t)$.


Starting with any given planar solution (22), consider the non-planar solu-
tions in the infinitesimal neighborhood of (22); so that the third of the equa-
tions (20) may be replaced by its Jacobi equation belonging to (22). Then
$z = z(t)$ is determined by the equation (2) of Adams, in which the coefficient
function $f(t)$ is obviously given by


(23) $f(t) = -\Omega_{zz}(x(t), y(t), 0)$.


Furthermore, on denoting by $\vartheta = \vartheta(t)$ the longitude of the ascending node,
and by $\iota = \iota(t)$ the (small) inclination, with reference to the synodical
coördinate system $(x, y)$, one has in (2)


(24) $z = -x\sin\iota\sin\vartheta + y\sin\iota\cos\vartheta$, $z' = -x'\sin\iota\sin\vartheta + y'\sin\iota\cos\vartheta$,

at least so long as

(25) $x(t)y'(t) - y(t)x'(t) \neq 0$.


In order to see this, it is sufficient to write down, within the degree of accuracy
of (2), the projections of the vector product of $(x, y, z)$ and $(x', y', z')$ on the
coördinate axes; cf. Levi-Civita [4], pp. 366-367 (where, however, the ascend-
ing node is referred to the sidereal, instead of the synodical, coordinate system;
so that one has to replace $\vartheta(t)$ by $\vartheta(t) - t$).




5. In view of (24), where $x$, $y$ are the given functions (22) of $t$, one
can replace the differential equation (2) of the second order for the ordinate
$z$ by two differential equations of the first order for the Eulerian angles $\iota$, $\vartheta$.
In order to legalize this, it is sufficient to observe that, barring the case of the
trivial solution $z(t) \equiv 0$ of (2) which belongs to the given generating solution
(22) of (20), the angles $\iota = \iota(t)$, $\vartheta = \vartheta(t)$ do not become undetermined,
since

(26) $\sin\iota \neq 0$

for every $t$. For if (26) were violated at some $t = t_0$, it would follow from
(24) that $z = 0$ and $z' = 0$ at this particular $t$. But then (2) implies that
$z \equiv 0$ for every $t$.


6. It turns out that the differential equations for $\iota$, $\vartheta$, mentioned at the
beginning of § 5, readily lead to the desired equations (3) in which $u$, $v$ repre-
sent certain linear combinations of $z$, $z'$ in such a way that the requirement
(4) of the Introduction becomes satisfied.

To this end, put


(27) $u = (xy' - yx')^{1/2}\sin\iota\cos\vartheta$, $v = (xy' - yx')^{1/2}\sin\iota\sin\vartheta$,


if the determinant (25) is positive, and modify the factors $(xy' - yx')^{1/2}$ in
(27) in an obvious manner, if the continuous non-vanishing function
(25) of $t$ is negative. According to (27), one can write (24) in the form


(28) $z = (xy' - yx')^{-1/2}(yu - xv)$, $z' = (xy' - yx')^{-1/2}(y'u - x'v)$


of a linear substitution of $u$, $v$ into $z$, $z'$. The coefficient matrix of this linear
substitution is, by (22), a known function of $t$ and has, in view of (28), the
determinant $+1$ for every $t$. Hence, on placing $p = z$, $q = z'$, and writing
(28) in the form (15), the condition (16) for a canonical transformation is
satisfied by $\mu = 1$. On the other hand, (2) may be written in the form

(29) $p' = -\partial H/\partial q$, $q' = \partial H/\partial p$,

if one puts

(30) $H = H(p, q; t) = -\tfrac12 q^2 - \tfrac12 f(t)p^2$, where $p = z$, $q = z'$.


Consequently, the representation of (2) in terms of the variables (27) is the
linear canonical system

(31) $u' = -\partial K/\partial v$, $v' = \partial K/\partial u$,


where the Hamiltonian function $K = K(u, v; t)$ is a quadratic form in $(u, v)$




and is explicitly given by (19) and (30), the determinants (18) being obtained
by identifying (15) with (28).

Finally, on identifying (31) with (3), one sees from (27) that the
requirement (4) of the Introduction is satisfied, and one has


(32) $u^2 + v^2 = (xy' - yx')\sin^2\iota$.


It should be mentioned for later application that the pair of conditions
(25), (26) is, in view of (32), equivalent to the condition that $u = u(t)$,
$v = v(t)$ do not vanish simultaneously; a condition which, in turn, is equiva-
lent to the exclusion of the trivial solution $u(t) \equiv 0$, $v(t) \equiv 0$ of (3), i.e.,
of (31).


7. Without assuming that the (real) system (3) has the canonical form
(31), suppose that its coefficient functions $a(t), \dots, d(t)$ have a common
period, say $\tau$. Suppose further that the characteristic exponents of (3) are of
the non-degenerate stable type, i.e., that the pair of the characteristic roots
of the monodromy group is of the form $(\rho, 1/\rho)$, where $|\rho| = 1$ but $\rho \neq \pm 1$.
Then it is readily seen from the Fuchs-Floquet representation of the general
solution of (3), that (3) admits a fundamental matrix which is the product
of two real matrices of the following type: One of these two matrix functions
of $t$ is periodic, with $\tau$ as period, while the other matrix factor not only is
periodic but represents a uniform rotation, with a period which is determined
by the characteristic exponent, i.e., by $\arg\rho$.

Now, the existence of a fundamental matrix which possesses a factori-
zation of this type clearly implies, not only that every solution $u = u(t)$,
$v = v(t)$ of (3) is almost periodic, but also that the greatest lower bound of
$u^2 + v^2$ for $-\infty < t < +\infty$ is distinct from zero for all those (real) solu-
tions $u = u(t)$, $v = v(t)$ of (3) for which $u^2 + v^2 = 0$ does not hold at
some $t = t_0$.

Consequently, $u = u(t)$ and $v = v(t)$ cannot simultaneously come arbi-
trarily close to 0 for $-\infty < t < +\infty$, if one excludes the trivial solution
$u(t) \equiv 0$, $v(t) \equiv 0$.
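The stability assumption of this section can be tested numerically for a given periodic coefficient. The following is a minimal sketch, not taken from the paper: it integrates $z'' + f(t)z = 0$ over one period by a classical Runge-Kutta scheme, starting from the unit matrix, and reads off the two characteristic roots of the resulting monodromy matrix. The coefficient `sample_f` $= 0.6 + 0.1\cos t$ (period $2\pi$) is a hypothetical choice made only for illustration; it is not the lunar coefficient (23), and all function names are assumptions.

```python
import cmath
import math


def rk4_step(f, t, z, dz, h):
    """One Runge-Kutta step for z'' + f(t) z = 0, written for the pair (z, z')."""
    def rhs(s, a, b):
        return b, -f(s) * a
    k1 = rhs(t, z, dz)
    k2 = rhs(t + h / 2, z + h / 2 * k1[0], dz + h / 2 * k1[1])
    k3 = rhs(t + h / 2, z + h / 2 * k2[0], dz + h / 2 * k2[1])
    k4 = rhs(t + h, z + h * k3[0], dz + h * k3[1])
    z_new = z + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    dz_new = dz + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return z_new, dz_new


def monodromy(f, period, steps=4000):
    """Fundamental matrix at t = period which reduces to the unit matrix at t = 0."""
    h = period / steps
    columns = []
    for z, dz in ((1.0, 0.0), (0.0, 1.0)):
        for k in range(steps):
            z, dz = rk4_step(f, k * h, z, dz, h)
        columns.append((z, dz))
    return [[columns[0][0], columns[1][0]],
            [columns[0][1], columns[1][1]]]


def characteristic_roots(m):
    """Roots rho of rho^2 - (trace) rho + (det) = 0 for a two-rowed matrix."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


def sample_f(t):
    # Hypothetical periodic coefficient, chosen only for illustration.
    return 0.6 + 0.1 * math.cos(t)


m = monodromy(sample_f, 2 * math.pi)
rho1, rho2 = characteristic_roots(m)
# Non-degenerate stable type: both moduli equal to 1 and the roots distinct from +1, -1.
print(abs(rho1), abs(rho2), cmath.phase(rho1))
```

Since the determinant of the monodromy matrix of (2) is 1, the roots have modulus 1 and differ from $\pm 1$ precisely when the absolute value of the trace is less than 2; this is the quantity one would inspect first.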


8. Now consider the case in which the given planar solution (22) of (20)
is periodic. Then so is the coefficient function (23) of (2) and, therefore,
the coefficient matrix of (29), or of (31). If, in particular, (22) is that
solution of the restricted problem of three bodies which corresponds to Hill’s
intermediary lunar orbit of his limiting case, then (25) is known to be satis-
fied, and the characteristic exponent of (2), i.e., of (31), fulfils the stability
condition required at the beginning of § 7 (as to the numerical situation, cf.
Tisserand [6], p. 288).




It follows, therefore, from § 7 that


(i) the solutions $u = u(t)$, $v = v(t)$ of (31) are almost periodic and
have frequencies contained in the integral modul of two numbers, say of $\lambda$
and $\nu$, where it is understood that $\lambda$ and $\nu$ are or are not linearly dependent
according as the period of (22) does or does not satisfy a commensurability
condition with reference to the characteristic exponent;


(ii) barring the trivial solution $u(t) \equiv 0$, $v(t) \equiv 0$, one has

$(u(t))^2 + (v(t))^2 > \text{const.} > 0$ for $-\infty < t < +\infty$,


where the const. depends on the integration constants of the solution $u = u(t)$,
$v = v(t)$ of (31).


On comparing (ii) with (27), (32), and using from (i) only the fact
that $u = u(t)$ and $v = v(t)$ are almost periodic, one sees that $\exp i\vartheta(t)$ is
almost periodic. It follows, therefore, from the general theorem of Bohr,
mentioned in the Introduction, that (1) holds for a certain constant $\omega$ and
for a certain almost periodic function $\psi(t)$.
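The conclusion just drawn can be illustrated by a small numerical experiment. The sketch below is a simplified stand-in for a solution of (31), not the lunar case: it assumes $u + iv = e^{i\lambda t} + \tfrac12 e^{i\nu t}$ with hypothetical frequencies $\lambda = 1$, $\nu = \sqrt 3$, so that $u^2 + v^2 \ge \tfrac14$ as in (ii). The polar angle is followed continuously and the quotient $\vartheta(T)/T$ is formed; under these assumptions it should approach $\lambda$, the frequency of the dominant term.

```python
import math

LAMBDA = 1.0          # assumed frequency of the dominant term
NU = math.sqrt(3.0)   # assumed second frequency, incommensurable with LAMBDA


def u(t):
    return math.cos(LAMBDA * t) + 0.5 * math.cos(NU * t)


def v(t):
    return math.sin(LAMBDA * t) + 0.5 * math.sin(NU * t)


def mean_motion(u_fn, v_fn, t_end, dt=0.01):
    """Estimate (theta(t_end) - theta(0)) / t_end for the continuous polar angle."""
    steps = int(round(t_end / dt))
    prev = math.atan2(v_fn(0.0), u_fn(0.0))
    total = 0.0
    for k in range(1, steps + 1):
        t = k * dt
        ang = math.atan2(v_fn(t), u_fn(t))
        step = ang - prev
        # Remove the jump of 2*pi which atan2 makes across the negative u-axis.
        step -= 2 * math.pi * round(step / (2 * math.pi))
        total += step
        prev = ang
    return total / t_end


print(mean_motion(u, v, 1000.0))
```

The remainder $\psi(t) = \vartheta(t) - \lambda t$ stays bounded here because the second term never exceeds half the first in modulus, so the error of the estimate decreases like $1/T$.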

If, in addition, use is made of the description (i) of the moduli of $u(t)$
and $v(t)$, it also follows that $\omega$ and the frequencies of $\psi(t)$ are contained in
the integral modul generated by the pair of numbers $\lambda$, $\nu$ (which may be
commensurable); cf. Bohr [2].

The actual values of the integers $j$, $k$ for which $j\lambda + k\nu$ becomes the mean
motion $\omega$ readily follow, for mere reasons of continuity, from an inspection
of Newton’s approximation, i.e., of the problems of two bodies (cf. Levi-
Civita [4], p. 376).


9. In view of the significance of the lunar node, it is natural to ask
how the values of the angle $\vartheta(t)$ are distributed asymptotically along the
boundary of a circle $0 < \theta \le 2\pi$. More precisely, the question concerns the
existence (and then the determination) of a function $\sigma = \sigma(\theta)$, the angular
asymptotic distribution function, which is defined for $0 < \theta \le 2\pi$ as follows:
If $L_T(\theta)$ denotes the sum of the lengths of those $t$-intervals which, on the one
hand, are contained in the range $0 \le t \le T$ and, on the other hand, are such
that on their points $t$ the (continuous) angular function $\vartheta(t)$, when reduced
mod $2\pi$, satisfies the inequalities $0 < \vartheta(t) \le \theta$, then there exists on $0 < \theta \le 2\pi$
a monotone function $\sigma(\theta)$ which satisfies the relation $\sigma(2\pi) - \sigma(+0) = 1$
and is such that the relative amount of time represented by the ratio $L_T(\theta) : T$
tends, as $T \to +\infty$, to the limit $\sigma(\theta)$ at every continuity point $\theta$ of $\sigma$.




It is known (cf. Haviland [3]) that a given $\vartheta = \vartheta(t)$ has an angular
asymptotic distribution function $\sigma = \sigma(\theta)$ if and only if all the time averages


(33) $M\{\exp in\vartheta(t)\}$, where $n = 0, 1, 2, \dots$

and $M\{g(t)\} = \lim_{T \to \infty} \int_0^T g(t)\,dt/T$, exist, in which case $\sigma(\theta)$ may be deter-
mined as the solution of the trigonometric momentum problem,


(34) $\int_0^{2\pi} e^{in\theta}\,d\sigma(\theta) = M\{\exp in\vartheta(t)\}$; $(n = 0, 1, 2, \dots)$.


Now, since $\exp i\vartheta(t)$, hence also $\exp in\vartheta(t)$, is almost periodic, the time
averages (33) exist, as does, therefore, $\sigma(\theta)$.


10. In view of (i), § 8, the actual determination of $\sigma$ is or is not an
“elementary” task according as $\lambda$ and $\nu$ are or are not commensurable.

In the first case, $\exp i\vartheta(t)$ is a periodic function; so that the asymptotic
averages (33) reduce to averages over a finite $t$-range, and so $\sigma(\theta)$ simply
follows from (34) by the inversion process which expresses the Lebesgue
integrals as Stieltjes integrals.

In the second case, the content of (i), § 8, may be expressed, with a suit-
able choice of notation, as follows: If $\Theta$ is the torus $0 \le \phi_1 \le 1$, $0 \le \phi_2 \le 1$
which is obtained from a Euclidean $(\phi_1, \phi_2)$-plane by reduction mod 1, then
there exists on $\Theta$ a continuous function of the position, say $F = F(\phi_1, \phi_2)$,
in such a way that either 0(t) = F(t, 7) or 


(35) V(t) =ot + F(ot, 


where o is an irrational number. 

It will be sufficient to consider the case (35). Then, by Weyl’s corollary
to Kronecker’s approximation theorem, the time average (33) may be expressed
as the space average of $\exp in\{\phi_1 + F(\phi_1, \phi_2)\}$ over the torus $\Theta$. Hence, (34)
becomes


(36) $\int_0^{2\pi} \exp in\theta\,d\sigma(\theta) = \int_0^1\!\!\int_0^1 \exp in\{\phi_1 + F(\phi_1, \phi_2)\}\,d\phi_1\,d\phi_2$; $(n = 0, 1, \dots)$.


Now, on writing the Lebesgue double integral on the right of (36) as a Stieltjes
simple integral, one sees from the uniqueness theorem of the trigonometric
momentum problem that the angular asymptotic distribution function $\sigma(\theta)$
is the area of the set of those points $(\phi_1, \phi_2)$ of $\Theta$ on which the function
$\phi_1 + F(\phi_1, \phi_2)$, when reduced mod $2\pi$ so as to lie between 0 and $2\pi$, attains
values which do not exceed $\theta$.


11. It is seen by comparison of the cases mentioned at the beginning of
§ 10, that the description of the angular asymptotic distribution function of
an almost periodic function $\exp i\vartheta(t)$ is a problem of Diophantine intricacy.
This situation is strikingly illustrated by the following consideration (which,
however, cannot be applied to the problem (35) at hand).

Let

(37) $\psi(t) = \sum_m a_m \cos(\lambda_m t - \alpha_m)$

be any real almost periodic function, and $\omega$ any real number which is not a
linear combination (with integral coefficients) of the frequencies $\lambda_m$ of $\psi(t)$.
Then the angular asymptotic distribution of $\vartheta(t) = \omega t + \psi(t)$ is the equi-
distribution; so that $\sigma(\theta)$ is the linear function $\theta : 2\pi$, no matter what is
the remainder term (37) of $\vartheta(t) - \omega t$.

In order to prove this, it is, in view of (34), sufficient to show that


$0 = M\{\exp in\vartheta(t)\}$ for $n = 1, 2, \dots$,

since $\int_0^{2\pi} e^{in\theta}\,d\theta = 0$ for $n = 1, 2, \dots$. But $M\{\exp in\omega t\} = 0$ for
$n = 1, 2, \dots$; while $\vartheta(t) = \omega t + \psi(t)$, where $\psi(t)$ is given by (37).
Consequently, it is sufficient to show that


$M\{\exp(in\omega t)\}\,M\{\exp(in \sum_m a_m \cos(\lambda_m t - \alpha_m))\}$

$= M\{\exp in(\omega t + \sum_m a_m \cos(\lambda_m t - \alpha_m))\}$.


Now, the truth of the last relation may readily be verified, for every $n$,
from the assumption that $\omega$ is linearly independent of the $\lambda_m$.
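A numerical check of this statement can be sketched as follows, under assumptions which are not in the paper: the remainder (37) is taken to consist of a single term with $a_1 = 0.5$, $\lambda_1 = 1$, $\alpha_1 = 0.3$, the mean motion is taken as $\omega = \sqrt 2$, and the mean value $M$ is replaced by an average over the finite range $0 \le t \le 1000$, computed by the midpoint rule. The names `time_average`, `AMPS`, `FREQS`, `PHASES` are hypothetical. For contrast, the last call uses $\omega = \lambda_1$, where the independence assumption fails.

```python
import cmath
import math

AMPS = [0.5]     # assumed amplitudes a_m
FREQS = [1.0]    # assumed frequencies lambda_m
PHASES = [0.3]   # assumed phases alpha_m


def time_average(n, omega, t_end=1000.0, dt=0.02):
    """Finite-range substitute for M{exp(i n theta(t))}, theta = omega t + psi(t)."""
    steps = int(t_end / dt)
    acc = 0j
    for k in range(steps):
        t = (k + 0.5) * dt  # midpoint of the k-th subinterval
        psi = sum(a * math.cos(lam * t - alpha)
                  for a, lam, alpha in zip(AMPS, FREQS, PHASES))
        acc += cmath.exp(1j * n * (omega * t + psi))
    return acc / steps


# omega independent of the frequency 1: the averages should be close to zero.
for n in (1, 2, 3):
    print(n, abs(time_average(n, math.sqrt(2.0))))

# omega equal to the frequency 1: the first average does not tend to zero.
print(abs(time_average(1, 1.0)))
```

In the dependent case the limit of the first average has, by the Jacobi-Anger expansion, the modulus of the Bessel coefficient $J_1(0.5)$, so that the distribution is then not the equidistribution.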


THE JOHNS HOPKINS UNIVERSITY. 




REFERENCES 


[1] H. Bohr, “Kleinere Beiträge zur Theorie der fastperiodischen Funktionen, I,” Det
Kgl. Danske Videnskabernes Selskab, Math.-Fys. Meddelelser, vol. 10, no. 10
(1930).

[2] H. Bohr, “Ueber fastperiodische ebene Bewegungen,” Commentarii Mathematici
Helvetici, vol. 4 (1932), pp. 51-64.

[3] E. K. Haviland, “On statistical methods in the theory of almost-periodic functions,”
Proceedings of the National Academy of Sciences, vol. 19 (1933), pp. 549-555.

[4] T. Levi-Civita, “Sur les équations linéaires à coefficients périodiques et sur le moyen
mouvement du noeud lunaire,” Annales de l’École Normale Supérieure, ser. 3,
vol. 28 (1911), pp. 325-376. Cf. also Libera Trevisani (Mrs. Levi-Civita),
“Sul moto medio dei nodi nel problema dei tre corpi,” Atti del Reale Istituto
Veneto di scienze, lettere ed arti, vol. 71 (1911-1912), pp. 1089-1137.

[5] H. Poincaré, “Sur les équations du mouvement de la lune,” Bulletin Astronomique,
vol. 17 (1900), pp. 167-204.

[6] F. Tisserand, Traité de Mécanique Céleste, vol. III, Paris (1894).

[7] E. R. van Kampen and A. Wintner, “On the canonical transformations of Hamil-
tonian systems,” American Journal of Mathematics, vol. 58 (1936), pp. 851-
863.

[8] A. Wintner, “On the linear conservative dynamical systems,” Annali di Matematica,
ser. 4, vol. 13 (1934), pp. 105-112.

[9] A. Wintner, “Ueber eine Anwendung der Theorie der fastperiodischen Funktionen
auf das Levi-Civitasche Problem der mittleren Bewegung,” ibid., vol. 10 (1931-
1932), pp. 277-282.




REMARKS ON A CONJECTURE OF MINKOWSKI.* 
By D. Derry. 


Let 


be $n$ linear forms with rational coefficients and determinant $+1$. A well
known conjecture of Minkowski states that if no integral valued solution
$x_1, x_2, \dots, x_n$ exists, other than the solution in which all the $x$’s are zero,
for which $|L_1(x)| < 1, \dots, |L_n(x)| < 1$ then the forms after a possible
unimodular transformation of the $x$’s and a rearrangement of order have the
form

% 

= + 22 


Mordell has shown¹ that the conjecture may be stated in another form.
Let $L_1(x), L_2(x), \dots, L_n(x)$ be linear forms with rational coefficients and
unit determinant which satisfy


Condition 1. For any set of integral values $x_1, x_2, \dots, x_n$, other than the
set in which all the $x$’s are zero, at least one of the forms takes a non-zero
integral value.

By Mordell’s result the Conjecture assumes 


Form 1. At least one of the forms has integral coefficients with no 


common factor. 
Let p be a prime which is henceforth fixed. We assume the forms also 


satisfy 


* Received April 7, 1939. 

¹ L. J. Mordell, “Minkowski’s theorems and hypotheses on linear forms,” Oslo
Congress 1936. Form 1, communicated to me verbally by Mr. Davenport, differs slightly
from the form given by Mordell. Mordell states the hypothesis in terms of the reciprocal
matrix of the forms and the conjecture itself in terms of the original forms. This form
states both the conjecture and the hypothesis in terms of the reciprocal matrix.




Condition 2. The coefficients of the forms are rational numbers whose 
denominators are some power of the prime p. 

This note considers further equivalent forms of the Conjecture when 
Condition 2 is satisfied. These are stated in terms of finite Abelian groups 
and their normal series. 


THEOREM 1. Without restriction in generality the forms L,(x) used in 


Form 1 may be taken to have the form 


L(x) = pz, 


where *,% are integers with  SreS° Stay p pts: 
and integral for j,k =s. 


Proof. The above special form is easily derived² from any system of
forms with unit determinant satisfying Condition 2 by rearranging the order
of the forms and subjecting the $x$’s to unimodular transformations.


THEOREM 2. $L_1(x), L_2(x), \dots, L_n(x)$ are a set of forms with unit
determinant satisfying Conditions 1 and 2 and with the form of Theorem 1.
Then the values the forms assume for a set of integral values $x_1, x_2, \dots, x_n$
are either all multiples of p™ or at least one of the forms assumes an integral 
value which is not a multiple of pt. 


Proof. For a set of integral values 2, -,%n, let Ly, Ln be 
the values taken by the forms L,(z), L2(z),- - -,Ln(a) respectively. Let 
I, be the value with the least subscript & which is a non-zero multiple of p™. 
If no such value exists either all the forms vanish, which occurs if and only if 
2, = 0, 2 = +, = 0, or by Condition 1 at least one of the forms must 
take a non-zero integral value which we have assumed not to be a multiple 
of p™; thus if no such value ZL, exists the truth of the theorem must be 
admitted. If we replace a by the forms L,(z), L2(z),- °°; 
Iy.,(x) retain their original values while Z;,(x) takes the value zero. The 


remaining forms +, Ln(x) take values which differ from 
the original values by multiples of p™ for by Theorem 1 p-"*aj, is integral for 
j =k and we are assuming p’|L;. By repeating this process we ultimately 


replace 2,,%2,° * *,2%n by a new set of integral values which give the forms 
3 bo) 


² B. L. van der Waerden, Moderne Algebra, § 106.




values which we may call L’,, L’,,- - -,Z’,. Each of this latter set of values 
differs by a multiple of p’ from the corresponding ‘value of the set L,, L2,- ++, Dn. 
None of the values L’;, L’:,- - +, L’n can be a non-zero multiple of p™ In 
case all these values are zero we deduce L,, L2,: - +, Ln are all integral multi- 
ples of p'. If on the other hand one of L’,, L’,,- - +, LZ’, is different from 
zero then by Condition 1 at least one value say L’, is a non-zero integer and 
furthermore an integer which is not a multiple of p™. Consequently Ly, is an 
integer which is not a multiple of p™. The theorem is then completely 


established. 

Form 2. 1,12,° * +51 are integers not all of which are zero for which 
and 1+ is an Abelian group 
of rank n with type (pomp tm. - - ptm): § a subgroup of % of type 
prim): +,3n a system of cyclic subgroups of 


which together generate %. For every subgroup 2% of § containing § with 3/2 
cyclic it is known that a subgroup 8, exists with & => 3,"", & = 3,. 
Then a subgroup 3, exists with 6 = 3,2", § 


Proof of equivalence to Conjecture. We shall first show how Form 1 of 
the Conjecture may be stated in terms of finite Abelian p-groups. Let 
be a set of linear forms satisfying Conditions 1 
and 2 and having the form of Theorem 1. We shall further assume that 
”, <0. For otherwise from the conditions stated in Theorem 1 regarding the 
integers 1, 1'2,° * *,7'n we deduce that the coefficients of the forms are all 
integral in which case there is nothing to prove. This last assumption im- 
plies x > 1. 


As 2, %2,° © *,@n each independent of the other take all integral values 
modulo p-"*" let G be the group of vectors p"(#,, @2,° *,%n); the sub- 


group of all vectors (1,(7), L2(@),° the subgroups 


of vectors 


As the forms have the form of Theorem 1, has type prret™: 


From Theorem 2 follows that for every non-zero element A of § groups Br, Br” 
exist with A « 3,”, A \ 2, and conversely if the groups have this latter property 
the forms satisfy the condition of Theorem 2, which implies Condition 1. If 
one of the forms L,(z) has integral coefficients § = 3,” and conversely the 
existence of such a group 3,” implies that the corresponding form has integral 
coefficients. If the integral coefficients of L,(2) have no common factor 




a) £ 3,’.. Again the converse is true for the relation § + 8,’ implies that 
the integral coefficients of L,(x) are not all divisible by p; because of the unit 
determinant and of the denominators of the forms none of them may have any 
other common factor. 

From a system of linear forms we have constructed an equivalent system 
of Abelian groups in terms of which the Conjecture was restated. It is possible 
to start with an abstract system of vector groups ©, 9, Brs Br’, 2” 1SrsSn, 
with all the above relationships including the fact that © have type 
and by considering an automorphism of ® on § to construct a system of 
linear forms with unit determinant satisfying Conditions 1 and 2. In other 
words to every set of groups satisfying the conditions of the above group form 
of the Conjecture a set of linear forms exists satisfying the original condi- 
tions of the Conjecture. Thus the complete equivalence of the two forms is 
established. 

Let % be the multiplicatively written character group of the group ©. 
To every subgroup 8 of & we order the subgroup B of % of all characters x 
for which y(A) =1 for Ae. It is a classical result of Weber that } = G 
and that B determines B while ¥/B = B. Accordingly if § be the character 
subgroup associated with it will have type pm), The 
dual groups 8, of the system 3, form a system of cyclic groups 3, = (Ar), 
1=r=n, which generate the group % while the dual groups of the system 
8,’, Br” are the cyclic groups (A,"""), (A,2"), 1S rn respectively. Now 
if A be an element of § the dual of the cyclic group (A) will be a subgroup 
MW of % containing § for which %/2 is cyclic. Conversely every subgroup 
W of % containing §, for which %/% is cyclic, is the dual of a cyclic subgroup 
(A) of §. Accordingly, translating the conditions of the above group form 
of the Conjecture into the character groups, we see for every such subgroup 
a subgroup 8, exists with & => 3,°, U4 8,. The Conjecture itself becomes 
under similar translation: a subgroup 8, exists with © = 8,?", © = 3,2". 
Thus the Conjecture in Form 1 is shown to be equivalent to Form 2 and the 
proof is complete. 


Definition. For an Abelian group & of order p" a series of subgroups 
= G.,- - -, = is said to form an r-series if 
cyclic and of order p" for l=s=n. 


Definition. A subgroup § of & is said to be reciprocally cyclic if the 
factor group G/® is cyclic. 


Form 3. % is an Abelian group of order p'™ with rank less than 2. 


REMARKS ON A CONJECTURE OF MINKOWSKI. 65 


€,, ©2,- - +, ©, are n cyclic subgroups of B. For every reciprocally cyclic sub- 
group D of B groups @z,, Gs.,- + -, ©s, exist so that the factors of the series 


B (G,,, ° : Sa, (Gs,, ®),® 


have order not greater than p”. Then the groups ©, ©2,---,@n after a 
possible rearrangement of order build an r-series 


B= (Ci, +, Gn),- +, (Gi, C2), Gi, (F) 
for the group %. 


Proof of equivalence to Conjecture. We first show the above would follow 
from a proof of the Conjecture expressed in Form 2. ©, ©2,- - -, @n together 
generate the group % for otherwise a group D containing ©,, ©2,---,@n 
would exist for which 8/D was cyclic and of order greater than 1. Then by 
hypothesis a group ©; would exist for which (€:,): D would be greater 
than 1, contradicting the fact that ©; = D. 

Let 1, 72,° be integers for which r; S rz and such that 
has type Now rn =r because the rank of is by 
hypothesis less than n. Furthermore the order of 8 being p'", r,; + r2 +: °° 
+7n=0. Now let be a group of type (po pt) of rank n 
generated by elements Z,,Z2,:--,Zn. If C1,C2,- be elements of B 
which generate the cyclic subgroups ©, ©2,- - -,€n respectively. we define a 
homomorphism of % on by the correspondence Cy, Zn Cn. 
This is possible because C,,C2,---,Cn generate 6 and the order of each 
element of %} is a multiple of the order of corresponding element of 8. Let 
§ be the subgroup of % which is built into the unit element of % in the 
above homomorphism. As %/ = % we deduce from the type of % and B that 
has type - For a subgroup of containing § we 
have by the second isomorphism theorem %/YU = %/O/U/H. Hence if MW is 
reciprocally cyclic, 2 is built by the homomorphism into a reciprocally cyclic 
subgroup 2 of B. Now by the hypothesis, as 7, = 7, a subgroup Cy, exists with 
1< ((Gs,D):D) S p™ from which we can conclude = (Zs). 
Thus % and its subgroups (Z,), 1S r<n, satisfy all the conditions of Form 2. 
Therefore if the Conjecture be true a number s, exists with § = (Z,,?""), 
This implies Gs?" (2), A (#) i.e. Cs, has exact 
order But as rn = 1, Gs, has order 

In the factor group let C'n be the cyclic 
subgroups of restclasses defined by ©,,- , ©s,-1, +, Gn respectively. 
By using the second isomorphism theorem, any reciprocally cyclic subgroup D’ 




of B/C., may be shown to have the form D/Cs, where D is a reciprocally cyclic 
subgroup of 8. By hypothesis a series 


exists from 8 to D whose factors have order not greater than p’. As D = G,, 
it follows from the second isomorphism theorem that the factors of this series 
are isomorphic to the factors of the series 


Therefore the factors of this series have order not greater than p". We have 
proved the order of @s, is p". Hence the order of the factor group 8/Gsz, is 
p"'"-), We have thus proved that the factor group 8/G;, and its associated 
subgroups +, satisfy all the conditions of Form 3 
with n replaced by n—1. We could therefore deduce from the truth of the 
Conjecture exactly as above that at least one element @’s, has order p"™ which 
means (@,,, ©s,) : = 

By repeating this process we deduce after n steps that ©,, Gn 
after a possible rearrangement of order build an r-series 


for $\mathfrak{B}$. This shows that the problem stated in Form 3 is a consequence of the
Conjecture as stated in Form 2.
To show the converse we need only consider the factor group %/ and 


its associated cyclic subgroups of restclasses ©,,@2,---,@n defined by 
B31, +, Bn respectively. %/ has order For any reciprocally cyclic 


D of %/H let D’ be the subgroup of % of all elements which are built into 
® by the homomorphism % ~ %/. D’ by the second isomorphism theorem 
is reciprocally cyclic, hence by the hypothesis of Form 2 a subgroup 3, exists 
with D’ = 81,2", D’ Therefore (D, €:,)/D has order greater than 1 
but not greater than p™. Now the subgroup (D, G:,) is also reciprocally cyclic 
and so proceeding exactly as before we could find a subgroup G;, with 
(C1; Ct, D)/(Cr,,D) of order greater than 1 but not greater than p™. Con- 
tinuing in this manner in a finite number of steps we could construct a series 


with cyclic factors of order not greater p™. Thus 3/6 and its subgroups 
C,, ©2.,- - -, © are seen to satisfy all the conditions of Form 3 with ~ replaced 
by rn. Therefore if Form 3 of the Conjecture were true a subgroup ©, would 
exist with exact order p™ which would imply: § = 8s?, ©  3.?""" which is 
Form 2 of the Conjecture. Therefore Forms 2 and 3 are completely equivalent. 


UNIVERSITY OF NEW MEXICO.


REMARKS ON MULTIGROUPS.* 


By J. E. EATON and OYSTEIN ORE.


The present paper may be considered as a supplement to a recent paper 
on multigroups by Dresher and Ore.’ It contains various contributions which 
lead to simplifications and improvements in certain parts of the previous 
theory of multigroups. The notations and terminology are the same and need
therefore not be explained here.


1. Existence of cross-cut. In the following let $\mathfrak{M}$ denote a multigroup
and let $\mathfrak{A}$ and $\mathfrak{B}$ be submultigroups. The theory of submultigroups differs
from ordinary group theory in that the cross-cut $(\mathfrak{A}, \mathfrak{B})$ may be void. On the
other hand certain theorems in the theory of normal submultigroups require
that the cross-cut of particular submultigroups shall not be void; hence this
must be stated as a separate condition, or it can be fulfilled by assuming that
the multigroup contains units. We shall show however that in some of the
most important cases these conditions are not necessary because the existence
of a cross-cut follows from:


THEOREM 1. Let $\mathfrak{A}$ be a left reversible and $\mathfrak{B}$ a left closed submultigroup
of $\mathfrak{M}$. Then the cross-cut $(\mathfrak{A}, \mathfrak{B})$ is not void.


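Proof. Let $a$ and $b$ be elements in $\mathfrak{A}$ and $\mathfrak{B}$ respectively and let us
determine $m$ such that
    $am \supset b$.

By the reversibility of $\mathfrak{A}$ it follows that $m \subset a_1 b$, or
    $a a_1 b \supset a_2 b \supset b$,

and here $a_2$ must belong to $\mathfrak{B}$ since $\mathfrak{B}$ is left closed.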
From Theorem 1 follows further:

THEOREM 2. Let $\mathfrak{A}$ be normal and left reversible while $\mathfrak{B}$ is left closed
in $\mathfrak{M}$. Then every element in the union $[\mathfrak{A}, \mathfrak{B}]$ is contained in a product $ab$.

Proof. It is obvious from the definition of normality that any element
not in $\mathfrak{A}$ or $\mathfrak{B}$ must be contained in such a product and for the elements in


* Received April 25, 1939. 

1 Melvin Dresher and Oystein Ore, “Theory of multigroups,” American Journal of 
Mathematics, vol. 60 (1938), pp. 705-733. We shall quote this paper in the following 
as D. and O. 



$\mathfrak{A}$ and $\mathfrak{B}$ it follows from the existence of an element $d$ belonging both to
$\mathfrak{A}$ and $\mathfrak{B}$.

On the basis of these two theorems one obtains the main properties of 
normal submultigroups: ? 


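Let $\mathfrak{A}$ and $\mathfrak{B}$ be normal reversible submultigroups. Then the union
$[\mathfrak{A}, \mathfrak{B}]$ is also normal and reversible and the cross-cut $(\mathfrak{A}, \mathfrak{B})$ is normal and
reversible in $\mathfrak{A}$ and in $\mathfrak{B}$.

If $\mathfrak{A}$ is normal and reversible and $\mathfrak{B}$ closed in the union $[\mathfrak{A}, \mathfrak{B}]$ then
$(\mathfrak{A}, \mathfrak{B})$ is normal and reversible in $\mathfrak{B}$ and there exists an isomorphism between
the quotient systems

    $[\mathfrak{A}, \mathfrak{B}]/\mathfrak{A} = \mathfrak{B}/(\mathfrak{A}, \mathfrak{B})$.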


2. Homomorphisms. A multigroup $\mathfrak{M}$ is said to be homomorphic to
another $\mathfrak{M}^*$ when there exists a correspondence $m \to m^*$ between their
elements such that

    $ab \supset c$
implies
    $a^* b^* \supset c^*$.
Furthermore every element of $\mathfrak{M}^*$ shall be the image of some element of $\mathfrak{M}$.

One proves (D. and O. Theorem 12, Chapter 2) that if $\mathfrak{M}^*$ contains a
left scalar unit $e^*$ then all elements $\mathfrak{A}$ of $\mathfrak{M}$ corresponding to $e^*$ form a right
multigroup which is right closed and if $e^*$ is an absolute unit element of $\mathfrak{M}^*$
then $\mathfrak{A}$ is a closed submultigroup.

In order to derive further properties of the homomorphism it is necessary 
to make assumptions on the inverse correspondence from $\mathfrak{M}^*$ to $\mathfrak{M}$.

We shall say that $\mathfrak{M}$ is left properly homomorphic to $\mathfrak{M}^*$ when:


1. $\mathfrak{M}^*$ contains a left scalar unit $e^*$.
We denote by $\mathfrak{A}$ the right closed right multigroup consisting of the
elements in $\mathfrak{M}$ corresponding to $e^*$.


2. If $m_1^* = m_2^*$ then there exist elements $a_1$ and $a_2$ in $\mathfrak{A}$ such that


This condition shows that $\mathfrak{A}$ is left reversible. Hence there exists a coset
expansion of $\mathfrak{M}$ with respect to $\mathfrak{A}$ (D. and O. Theorem 9, Chapter 2) and it
is easily shown:


THEOREM 3. If a multigroup $\mathfrak{M}$ is left properly homomorphic to another
$\mathfrak{M}^*$ then $\mathfrak{M}^*$ is isomorphic to a quotient multigroup


2 These are somewhat simplified statements of the results in D. and O., chap. 3, § 2. 




    $\mathfrak{M}^* = \mathfrak{M}/\mathfrak{A}$
where $\mathfrak{A}$ is a left reversible right submultigroup of $\mathfrak{M}$.³


In the paper by Dresher and Ore also the following type of homomorphism
has been introduced: A multigroup $\mathfrak{M}$ is strongly (left) homomorphic to $\mathfrak{M}^*$
when any relation

    $a^* b^* \supset c^*$

implies that to any $b_0$ and $c_0$ corresponding to $b^*$ and $c^*$ respectively there
exists some $a_0$ corresponding to $a^*$ such that

    $a_0 b_0 \supset c_0$.
One can then show:


THEOREM 4. Let $\mathfrak{M}$ be (left and right) properly homomorphic to $\mathfrak{M}^*$.
Then $\mathfrak{M}$ is also strongly homomorphic to $\mathfrak{M}^*$.


Proof. When $\mathfrak{M}$ is both left and right properly homomorphic to $\mathfrak{M}^*$
there must exist an absolute unit $e^*$ in $\mathfrak{M}^*$ and those elements in $\mathfrak{M}$ which
correspond to $e^*$ form a reversible submultigroup $\mathfrak{A}$. But in this case $\mathfrak{A}$ must
also be normal since all $m$ with the same image $m^*$ must be contained both in
a coset $m\mathfrak{A}$ and a coset $\mathfrak{A}m$. From Theorem 3 it follows that $\mathfrak{M}^*$ is isomorphic
to the quotient multigroup $\mathfrak{M}/\mathfrak{A}$. To show that $\mathfrak{M}$ is strongly homomorphic
to $\mathfrak{M}/\mathfrak{A}$ let us assume that a relation


holds for three cosets. Any m corresponding to m,% may be taken as the 
multiplier of this coset and similarly for m,%. Hence we shall only have to 
show that (1) implies the existence of some element x in m.%M such that 


mz 


and this follows from the normality of 2. 
This also implies D. and O. Theorem 1, Chapter 3. 


3. Strong normality. An important concept in the theory of multigroups
is that of strong normality. This concept has been defined (D. and O.
§ 4, Chapter 3) under the assumption that right and left units and hence
inverses exist in the multigroup $\mathfrak{M}$. We shall show here that one can also
give alternative definitions in which these assumptions are not necessary.


DEFINITION. A closed submultigroup $\mathfrak{A}$ of $\mathfrak{M}$ is strongly normal if for
any $m_1$ there exists an $m_1'$ such that


*This theorem is a corrected form of Theorem 13, chap. 2 in D. and O. In all state- 
ments on p. 721 strong homomorphism should be replaced by proper homomorphism. 




(2)    $\mathfrak{A} \supset m_1 \mathfrak{A} m_1'$


and to every $m_2$ an $m_2'$ such that


(3)    $\mathfrak{A} \supset m_2' \mathfrak{A} m_2$.


Let us derive some consequences of this definition. We determine m,” 
such that 
(4) m,’ Wm,” 
and from (2) one obtains 


(5) m Am! Am” = Am,” = mA. 
When this is substituted in (4) one finds further 


YO m’, 
and since %& is closed 

m’,m, 
Similarly one has 

mom’, 


We show next: 
A strongly normal submultigroup is reversible. 


If namely 
a,m, Me 


then one can determine some z such that 


Mm, 
or 
> Mo. 


When this relation is multiplied by m’, one finds a relation of the form 
A202 As 


showing that z belongs to %& since YF is closed. 
A strongly normal submultigroup is normal. 


From the reversibility it follows that right and left coset expansions of $\mathfrak{M}$
with respect to $\mathfrak{A}$ exist and since each coset contains its multiplier we obtain
from (5)

    $\mathfrak{A} m_1 = m_1 \mathfrak{A}$.


Let us finally consider the quotient multigroup $\mathfrak{M}/\mathfrak{A}$. From the condition
of strong normality it follows that for any $m_1$ and $m_2$ one can find an $m_3$


such that 




Wm; Ame. 


This indicates however that the product of two cosets contains only a single 
coset; hence the multigroup $\mathfrak{M}/\mathfrak{A}$ is an ordinary group.


THEOREM 5. The necessary and sufficient condition that a multigroup
$\mathfrak{M}$ be homomorphic to a group $\mathfrak{G}$ is that $\mathfrak{M}$ contain a strongly normal
submultigroup $\mathfrak{A}$ such that

    $\mathfrak{G} = \mathfrak{M}/\mathfrak{A}$.

We have already proved the sufficiency of this condition and the necessity 
follows by the same argument as in the proof of Theorem 12, Chapter 3 in 
D. and O. 

Let us prove finally: 

THEOREM 6. The strongly normal submultigroups form a Dedekind
structure.


Proof. Since the Dedekind relation holds for normal submultigroups 
(D. and O. Theorem 5, Chapter 3) we shall only have to show that the 
submultigroups form a structure. 

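The union of two strongly normal submultigroups $\mathfrak{A}$ and $\mathfrak{B}$ is closed
(D. and O. Theorem 5, Chapter 2) and $[\mathfrak{A}, \mathfrak{B}] = \mathfrak{A}\mathfrak{B}$ (Theorem 2). To any
$m$ let $m'$ be determined such that $mm' \subset \mathfrak{A}$. Then

    $m \mathfrak{A}\mathfrak{B} m' = mm' \mathfrak{A}\mathfrak{B} \subset \mathfrak{A}\mathfrak{B}$.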


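The cross-cut $\mathfrak{D} = (\mathfrak{A}, \mathfrak{B})$ is also closed (D. and O. Theorem 4, Chapter
2). Let $m$ and $m'$ be arbitrary. Then

    $\mathfrak{A}x \supset m\mathfrak{D}m'$,    $\mathfrak{B}x \supset m\mathfrak{D}m'$

for any $x$ in $mm'$. If $m'$ is determined such that $mm' \supset d$, where $\mathfrak{D}$ contains
$d$, then
    $\mathfrak{A} \supset m\mathfrak{D}m'$,    $\mathfrak{B} \supset m\mathfrak{D}m'$
and hence
    $\mathfrak{D} \supset m\mathfrak{D}m'$.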

Theorem 6 again implies the existence of a unique minimal strongly
normal submultigroup $\mathfrak{A}_0$ such that $\mathfrak{M}/\mathfrak{A}_0$ is a group.

To conclude let us remark that a submultigroup $\mathfrak{A}$ can also be said to be
strongly normal if it is left reversible and the relation (2) holds. This
definition can be shown to be equivalent to the preceding.


YALE UNIVERSITY. 




ON THE IMBEDDING OF ONE SEMI-GROUP IN ANOTHER, WITH 
APPLICATION TO SEMI-RINGS.* 


By H. S. VANDIVER. 


In other papers¹ a semi-group was defined as a set of elements closed
under an associative operation and for which the equivalence and the
substitution postulates hold. In the present paper we shall employ instead of the
substitution postulate, the postulate that if $A = B$, then $CA = CB$ and
$AC = BC$ for any $A$, $B$ or $C$ in the set, which we shall call the composition
postulate.²

A gruppoid³ is a semi-group with an identity element, that is, an $E$ such
that $AE = EA = A$ for any $A$ in the set. A quasi-group is a semi-group
such that from either of the relations

    $AB = AC$    and    $BD = CD$

we infer $B = C$,

where each letter denotes an element of the semi-group. Cancellable elements
in a semi-group $S$ are elements $C$ such that if $CM = CN$ then $M = N$ or if
$HC = KC$ then $H = K$, where each letter denotes an element of $S$. It is
easy to see that the product of two cancellable elements in $S$ is also cancellable:
hence these elements form a sub-set of $S$ which is a quasi-group. Isomorphism
between two semi-groups is defined in the same way as for the isomorphism
between two groups, and the central of $S$ will be the set of elements in $S$
which are permutable with each element of $S$, as in group theory. A semi-group
$S$ will be said to be imbedded (or immersed) in another semi-group
$S'$ if $S'$ contains a sub-semi-group which is isomorphic to $S$.

Graves ‘ in a recent paper showed how to immerse a commutative quasi- 


* Received July 3, 1939. 

1 Vandiver, Proceedings of the National Academy of Sciences, vol. 20 (1934), p. 579; 
Bulletin of the American Mathematical Society, vol. 40 (1934), p. 916; American Mathe- 
matical Monthly, vol. 46 (1939), p. 24. 

2 The relations between these two sets of postulates and to other sets of postulates 
for a semi-group I hope to discuss elsewhere. 

3 Here we follow the terminology used by Specht and Garrett Birkhoff. Cf. the
latter, Annals of Mathematics, vol. 35 (1934), p. 351 and note references there given.

4 American Mathematical Monthly, vol. 45 (1938), pp. 664-69. Graves calls the
system I have called a quasi-group a semi-group.


group in a group. In the present paper we prove a generalization (Th. 1) 
of this result and apply it to the theory of semi-rings. 


THEOREM 1. If a semi-group S contains a cancellable element, and all
the cancellable elements of S belong to its central, then S may be imbedded 
in a gruppoid S’ whose cancellable elements form an Abelian group G, and 
the identity element of G is the identity element of S’. 


For proof, let the distinct elements of $S$ be denoted by
(1)    $a_1, a_2, \dots$
and in particular the distinct elements of this set which are cancellable by
(2)    $c_1, c_2, \dots$

since the latter, by hypothesis, is not a null-set.
Then consider the set $S'$ formed by the pairs

    $(a_i, c_s)$
and write

(3)    $(a_i, c_s)(a_k, c_t) = (a_i a_k, c_s c_t)$.
Also we shall agree that
(4)    $(a_h, c_r) = (a_i, c_v)$
if and only if
    $a_h c_v = a_i c_r$.
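
The product rule (3) and the equality rule (4) can be tried out on a concrete case. The following is only a minimal sketch under an assumption not made in the paper: S is taken to be the ordinary integers under multiplication, so that the cancellable elements are the non-zero integers and all of them lie in the central. The function names `mul`, `eq` and `embed` are illustrative.

```python
# Sketch: pairs (a, c) over the integers under multiplication.
# Assumption (not from the paper): S = Z, cancellable elements = non-zero integers.

def mul(p, q):
    """Rule (3): (a_i, c_s)(a_k, c_t) = (a_i a_k, c_s c_t)."""
    (a, c), (b, d) = p, q
    return (a * b, c * d)


def eq(p, q):
    """Rule (4): (a_h, c_r) = (a_i, c_v) if and only if a_h c_v = a_i c_r."""
    (a, c), (b, d) = p, q
    return a * d == b * c


def embed(a, c=1):
    """Image (a c, c) of the element a of S; c is any fixed cancellable element."""
    return (a * c, c)


identity = (5, 5)                      # any pair (c, c) with c cancellable
x = (2, 3)
acts_as_identity = eq(mul(x, identity), x) and eq(mul(identity, x), x)
has_inverse = eq(mul((2, 3), (3, 2)), identity)
embedding_respects_products = eq(mul(embed(2, 7), embed(3, 7)), embed(6, 7))
```

In this special case the pairs behave like ordinary fractions a/c, which is the analogy the paper itself draws later.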

By (3) the closure law holds for the pairs since the closure holds for the $a$'s,
and also for the $c$'s. We now examine the equivalence postulates.

    $(a_i, c_s) = (a_i, c_s)$
obviously holds from (4). Symmetry obviously holds since it holds for the
$a$'s. As for transitivity, if

    $(a_i, c_s) = (a_j, c_t)$

and

    $(a_j, c_t) = (a_k, c_r)$
then
(5)    $a_i c_t = a_j c_s$,    $a_j c_r = a_k c_t$;

whence, since composition holds in $S$,
    $a_i c_t c_r = a_j c_s c_r$

and if the $c$'s belong to the central of $S$ then




    $a_i c_r c_t = a_j c_r c_s$
or from (5), using composition and transitivity in $S$,
    $a_i c_r c_t = a_k c_s c_t$
and since $c_t$ is a cancellable element in $S$ then

    $a_i c_r = a_k c_s$
and by (4)
    $(a_i, c_s) = (a_k, c_r)$.


Transitivity then holds. We also have that if

(6)    $(a_i, c_s) = (a_k, c_r)$
then
(7)    $(a_i, c_s)(a_j, c_t) = (a_k, c_r)(a_j, c_t)$,

for we have from (6)
    $a_i c_r = a_k c_s$
and
    $a_j c_t = a_j c_t$
or
    $a_i a_j c_r c_t = a_k a_j c_s c_t$
whence we obtain (7), using (3) and (4). Hence composition holds since
(7) holds for the reverse order of the factors. It is easy to verify that the
associative law holds for the pairs, hence, with the above conclusions, we have
proved that $S'$ is a semi-group. We shall now show that $S'$ is a gruppoid.
Since $S$ by hypothesis contains a cancellable element, say $c$, we note that

(8)    $(a_i, c_s)(c, c) = (a_i c, c_s c)$
and also
    $a_i c c_s = a_i c_s c$
whence
    $(a_i c, c_s c) = (a_i, c_s)$
and (8) gives
    $(a_i, c_s)(c, c) = (a_i, c_s)$.
We find similarly that
    $(c, c)(a_i, c_s) = (a_i, c_s)$

so that $(c, c)$ is an identity element of $S'$, and $S'$ is then a gruppoid.
We shall now show that all elements of the form 




    $(c_s, c_t)$
are cancellable in $S'$. For suppose that

    $(c_s, c_t)(a_i, c_r) = (c_s, c_t)(a_j, c_v)$;
then
    $(c_s a_i, c_t c_r) = (c_s a_j, c_t c_v)$
by (3), and (4) gives

    $c_s a_i c_t c_v = c_s a_j c_t c_r$

and since the $c$'s belong to the central of $S$ and are also cancellable in $S$ we have
    $a_i c_v = a_j c_r$

or
    $(a_i, c_r) = (a_j, c_v)$.

We obtain a similar result for the other order of the factors.
Hence $(c_s, c_t)$ is a cancellable element in $S'$.
It will now be proved that any element of the form

    $(a_i, c_s)$
is non-cancellable in $S'$ if $a_i$ is non-cancellable in $S$. For in that case there
exist elements $a_j$ and $a_k$ with $a_j \neq a_k$ and
(9)    $a_j a_i = a_k a_i$,
or there exist elements $a_{j_1}$ and $a_{k_1}$ with $a_{j_1} \neq a_{k_1}$ and such that
(10)    $a_i a_{j_1} = a_i a_{k_1}$.
Now if (9) holds we have

    $(a_j a_i, c c_s) = (a_k a_i, c c_s)$
or
    $(a_j, c)(a_i, c_s) = (a_k, c)(a_i, c_s)$
with
    $(a_j, c) \neq (a_k, c)$.

We obtain the same result after treating (10) in a similar way.

Hence all the cancellable elements of $S'$ are of the form $(c_s, c_t)$. These
form a group $G$ in $S'$ since for given elements $(c_s, c_t)$ and $(c_h, c_i)$ we may
verify that

    $(c_s, c_t)(c_h c_t, c_i c_s) = (c_h, c_i)$

and similarly for the other order of the factors on the left, and the result
follows if we note the product of two cancellable elements in $S$ is a cancellable




element in $S$. Also the group is Abelian since the $c$'s are commutative in
$S$. We now show that
    $a_1, a_2, \dots$
is isomorphic with
    $(a_1 c, c), (a_2 c, c), \dots$.
To show this we note first that the elements of the last set are distinct since
the $a$'s are, for
    $(a_i c, c) = (a_j c, c)$

gives
    $a_i c^2 = a_j c^2$
or
    $a_i = a_j$.
Now if
    $(a_h c, c) \leftrightarrow a_h$;    $h = 1, 2, 3, \dots$
and
    $(a_i c, c)(a_j c, c) = (a_k c, c)$
then

    $a_i a_j = a_k$

and conversely; hence the sets are isomorphic. Also the identity element of $G$
is $(c, c)$, so that the identity element of $G$ is the identity element of $S'$ and
our theorem is proved.

We note the peculiarity that in order to prove the transitivity law in $S'$
we make use of the fact that the $c$'s are cancellable in $S$. We may note that
if we use a non-cancellable element $n$ selected from the $a$'s in connection with
the pairs of the type

    $(a, n)$

then it is possible to select a particular set such that the law of transitivity
does not hold within it. For let $S$ contain an annulator, say $k$, then if we use
the definition of equality as in (4) we have
    $(k, k) = (a_i, c_j)$

for any $i$ and $j$, but transitivity does not hold since $(a_i, c_j) \neq (a_s, c_j)$ for $i \neq s$
and $c_j$ a cancellable element. Since the pairs obviously obey laws similar to
those which fractions follow under multiplication in ordinary arithmetic, we
see that the above situation is similar to what we have in arithmetic when
we attempt to employ the fraction 0/0.
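
The failure of transitivity described in this remark can be seen directly in a small computation. This is a sketch under the assumption that S is the integers under multiplication, with the annulator k = 0; the names `eq`, `pairs` and `transitive_on_proper_pairs` are illustrative and do not come from the paper.

```python
from itertools import product

# Sketch: equality of pairs by cross-multiplication, as in rule (4).
# Assumption (not from the paper): S = Z, with annulator k = 0.

def eq(p, q):
    """(a, c) = (b, d) if and only if a d = b c."""
    (a, c), (b, d) = p, q
    return a * d == b * c


# Admitting the pair (0, 0) breaks transitivity: it "equals" every pair.
k_pair = (0, 0)
x, y = (1, 2), (1, 3)
transitivity_fails_with_annulator = eq(x, k_pair) and eq(k_pair, y) and not eq(x, y)

# With cancellable (non-zero) second entries only, transitivity holds on a small grid.
pairs = [(a, c) for a in range(-3, 4) for c in range(-3, 4) if c != 0]
transitive_on_proper_pairs = all(
    eq(p, r)
    for p, q, r in product(pairs, repeat=3)
    if eq(p, q) and eq(q, r)
)
```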


Application to semi-rings. Following closely another paper ® we define 


5 Proceedings of the National Academy of Sciences, vol. 21 (1935), p. 162.




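a semi-ring as a system of elements which form a semi-group under addition,
a semi-group under multiplication and the right and left distributive laws
hold. Let $S$ now form a semi-ring and employ the notation

    $\frac{a}{c}$
for $(a, c)$ and define addition of these symbols by means of

(11)    $\frac{a_1}{c_1} + \frac{a_2}{c_2} = \frac{a_1 c_2 + a_2 c_1}{c_1 c_2}$

and call

    $\frac{a_1}{c_1} \cdot \frac{a_2}{c_2} = \frac{a_1 a_2}{c_1 c_2}$

multiplication. It is easily seen that addition is associative since $S$ is a
semi-ring. The distributive law holds since

    $\frac{a_1}{c_1}\left(\frac{a_2}{c_2} + \frac{a_3}{c_3}\right) = \frac{a_1 (a_2 c_3 + a_3 c_2)}{c_1 c_2 c_3} = \frac{(a_1 a_2)(c_1 c_3) + (a_1 a_3)(c_1 c_2)}{(c_1 c_2)(c_1 c_3)}$

since $c_1$ is a cancellable element under multiplication in $S$. But the last
expression on the right equals

    $\frac{a_1}{c_1} \cdot \frac{a_2}{c_2} + \frac{a_1}{c_1} \cdot \frac{a_3}{c_3}$

and similarly we find

    $\left(\frac{a_2}{c_2} + \frac{a_3}{c_3}\right)\frac{a_4}{c_4} = \frac{a_2 a_4}{c_2 c_4} + \frac{a_3 a_4}{c_3 c_4}$.

It is easily seen then, that $S'$, consisting of the elements $a/c$, is a semi-ring.
Hence we have the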
Hence we have the 


THEOREM 2. A semi-ring R whose multiplicative semi-group S contains a
cancellable element, and the cancellable elements of S belong to the central
of S, may be imbedded in a semi-ring R' whose multiplicative semi-group is a
gruppoid. The cancellable elements of this gruppoid form a group whose
identity element is the identity of the gruppoid.


We shall call R’ the quotient semi-ring of R. 
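
As a hedged illustration of the quotient semi-ring, the sketch below takes R to be the non-negative integers (an assumption, chosen only because it is a semi-ring that is not a ring) and assumes that rule (11) is the usual cross-multiplied sum of fractions. It checks both distributive laws on a small grid of pairs; the helper names are illustrative.

```python
from itertools import product

# Sketch: quotient semi-ring of the non-negative integers.
# Assumptions (not from the paper): R = {0, 1, 2, ...}; addition rule (11) is
# a1/c1 + a2/c2 = (a1 c2 + a2 c1) / (c1 c2); denominators are non-zero.

def eq(p, q):
    (a, c), (b, d) = p, q
    return a * d == b * c


def add(p, q):
    (a, c), (b, d) = p, q
    return (a * d + b * c, c * d)


def mul(p, q):
    (a, c), (b, d) = p, q
    return (a * b, c * d)


elems = [(a, c) for a in range(0, 4) for c in range(1, 4)]

left_distributive = all(
    eq(mul(x, add(y, z)), add(mul(x, y), mul(x, z)))
    for x, y, z in product(elems, repeat=3)
)
right_distributive = all(
    eq(mul(add(y, z), x), add(mul(y, x), mul(z, x)))
    for x, y, z in product(elems, repeat=3)
)
```

The two sides of each law differ only by a common cancellable factor in numerator and denominator, which is why the comparison uses `eq` rather than `==`.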
Suppose now that $R$ is a ring, then since its additive semi-group is an




Abelian group, using (11) we see that $R'$ forms an additive Abelian semi-group
since $R$ does and it is a group since

    $\frac{a_1}{c_1} + x = \frac{a_2}{c_2}$
has the solution
    $x = \frac{a_2 c_1 - a_1 c_2}{c_1 c_2}$

and the word semi-ring may be replaced by ring in the statement of Theorem
2. Also we see that any non-cancellable element $n$ of the multiplicative
semi-group of a ring is zero or a zero divisor in the ring, since

    $na = nb$
with $a \neq b$ gives
    $n(a - b) = 0$;

and $n$ is zero or a zero divisor since $a - b \neq 0$. A ring where division is
always uniquely possible when the divisor is not zero or a zero divisor is
called a quasi-field by the writer in another paper.⁶ This term however is
employed in a different sense by other writers.⁷ If $R$ is a realm or domain
of integrity then $R'$ is a field called the quotient field of $R$. This special case
of Theorem 2 is well known.⁸
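
The statement that a non-cancellable element of a ring is zero or a zero divisor can be checked by brute force in a small finite ring. The sketch below uses the integers modulo n, which is an illustrative choice and not an example from the paper; the function names are likewise hypothetical.

```python
# Sketch: in Z/nZ (n >= 2), the non-cancellable elements are exactly
# zero and the zero divisors. Z/nZ is an assumption made for illustration.

def non_cancellable(n):
    """Elements k with k*a = k*b (mod n) for some a != b."""
    return [
        k for k in range(n)
        if any((k * a - k * b) % n == 0
               for a in range(n) for b in range(n) if a != b)
    ]


def zero_or_zero_divisor(n):
    """Elements k with k = 0, or k*d = 0 (mod n) for some non-zero d."""
    return [
        k for k in range(n)
        if k == 0 or any((k * d) % n == 0 for d in range(1, n))
    ]


agree_up_to_12 = all(
    non_cancellable(n) == zero_or_zero_divisor(n) for n in range(2, 13)
)
```

For n = 6 both lists are [0, 2, 3, 4], while for the prime modulus n = 7 only 0 is non-cancellable, in line with the remark on domains of integrity.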


UNIVERSITY OF TEXAS.


6 Proceedings of the National Academy of Sciences, vol. 21 (1935), p. 162.
7 Cf., for example, Albert, Modern Higher Algebra, p. 23.
8 Van der Waerden, Moderne Algebra, 1st ed., Bd. 1, p. 7; Albert, loc. cit., pp. 27-29.




NOTE ON EULER NUMBER CRITERIA FOR THE FIRST CASE OF 
FERMAT’S LAST THEOREM.* 


By H. S. VANDIVER.


For the solution of
(1)    $x^l + y^l + z^l = 0$,
$xyz \not\equiv 0 \pmod{l}$; $x$, $y$ and $z$ rational integers, $l$ a given odd prime, we have
the Kummer criteria


    $B_n f_{l-2n}(t) \equiv 0 \pmod{l}$,

(2)
    $f_{l-1}(t) \equiv 0 \pmod{l}$
where
(2a)    $-t = x/y,\ y/x,\ x/z,\ z/x,\ y/z,\ z/y$, modulo $l$,
    $(n = 1, 2, \dots, (l-3)/2)$,
    $f_i(w) = \sum_{r=1}^{l-1} r^{i-1} w^r$.


Further, the $B$'s are the Bernoulli numbers, $B_1 = 1/6$, $B_2 = 1/30$, etc. Now
all the known criteria which have been derived from (2) for the solution of (1)
and which are independent of $x$, $y$, and $z$ may be shown to have a certain
relation to each other. All may be derived from a set of criteria of the form

(3)    $C(m, i, r) = \sum_{s=[(r-1)l/m]+1}^{[rl/m]} s^i \equiv 0 \pmod{l}$

for certain small values of $m$ and where $i = l-2, l-3, l-4, l-6, l-8,$
$l-10, l-12$; with $r$ in the set $1, 2, \dots, m-1$, and criteria consisting
of linear functions of the type (3) with rational coefficients.
It is known that ? 

(4) bea = (n — 21+ 1)C(n, 2a — 1, 1) 
modulo /, for 1 > 3; n¥0 (mod 1); [h] is the greatest integer in h, and the 
b’s are defined by 

where the left-hand member is expanded by the binomial theorem and by 
substituted for b¢. As is known (—1)**Ba=bo. For n=1—1, (4) gives 


[n/2 


= 2 (n — 20+ 1)C(n, 1— 2,1) 


* Received July 3, 1939. 
1Vandiver, Duke Mathematical Journal, vol. 3 (1937), p. 572, relation 10. 




modulo /, and using 


(mod 1) 


we have 
(5) > (n— %+1)C(n, 1— 2,7) 
4=1 


modulo $l$. Frobenius² found criteria of the type (3) for $i = l - 2$, and various
values of $m \leq 26$. Morishima³ showed that if (1) is satisfied in Case 1 then

    $m^{l-1} \equiv 1 \pmod{l^2}$

for each $m$ such that $0 < m \leq 31$. Employing (5) we obtain criteria of the
type mentioned above.
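
To see how restrictive congruences of the form $m^{l-1} \equiv 1 \pmod{l^2}$ are, one can search small primes $l$ directly. The sketch below is illustrative only: the search limit and the function names are assumptions, and the check is a plain numerical experiment, not part of the argument of the paper.

```python
# Sketch: primes l for which m^(l-1) = 1 (mod l^2) holds for every base m in a list.
# The limits used here are arbitrary choices for illustration.

def is_prime(n):
    """Trial division; adequate for the small range searched here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def satisfies(l, bases):
    """True when m^(l-1) is congruent to 1 modulo l^2 for every m in bases."""
    return all(pow(m, l - 1, l * l) == 1 for m in bases)


def candidates(bases, limit):
    """Odd primes l below limit that satisfy the congruence for all bases."""
    return [l for l in range(3, limit, 2) if is_prime(l) and satisfies(l, bases)]


base_2_only = candidates([2], 4000)
base_2_and_3 = candidates([2, 3], 4000)
```

Below 4000 the base 2 congruence alone leaves two primes, and adding base 3 leaves none, which indicates how quickly several such criteria taken together exclude small exponents.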

It has been shown, using (2), that for $2n = l-3, l-5, l-7, l-9,$
$l-11, l-13$, we have the criteria $B_n \equiv 0 \pmod{l}$ in Case 1. Using (4)
we obtain more criteria of the type mentioned in connection with (3). Also,
employing various formulas due to the writer (l. c., pp. 572-4) we obtain a number
of congruences each involving only one of the $C(m, i, r)$.

In another paper of the writer’s* it was shown that if (1) is satisfied 
in Case 1 then 

[1/3] 

(6) =0 (mod 1), 


f=1 


and it was shown by Schwindt® that this yields the relation 


[1/6] 
(6a) > =0 (mod 


r=1 


From the writer’s article last cited (p. 91, and see also last paragraph in 
article) we also have the criteria 


(1) Fa(t/p) 0 (moat) 


where p is an m-th root of unity and 3 indicates summation, over all distinct 
values ~1, of p;(m,l) =1. Further 


F.(w) =w + 2w? (ml—1)w™™, 


The value m = 3 gives (6). Set m= 4, then we note that 


2 Berlin Sitzungsberichte (1914), pp. 653-81. Cf. also Emma Lehmer, Annals of
Mathematics, vol. 39 (1938), pp. 358-9.

3 Japanese Journal of Mathematics, vol. VIII (1931), pp. 159-173.

4 Annals of Mathematics, vol. 26 (1924), pp. 88-94.

5 Jahresberichte Deutscher Mathematische Verein, vol. 43 (1933-4), pp. 229-31.




wit — ] 
w—1 
d 1 4lw*! 
_ 2) 
(w—1)? 
where we regard a fraction of the form 61, where the denominator is a poly- 
nomial in w not all of whose coefficients are divisible by 1, as =0 (mod /). 
Set w= 1/p; we find since pt = 1, (mod J), 


=w-+2w?+---+ (mod 


modulo /, and 


(—(t/p) 


modulol. Now 


= = (t*—1) F2(t/p) 


t/p)* — 
== (t/p — p) (t/p — p*) (t/p — p*) 
(t—1) (t—p’*) (t—?*) 


3 ? 


so that (8) becomes, using (7), and noting that (mod /) and 40 
(mod /), 


p! 
or since p® + ++ p+ we have 
(9) + (p+ +p)? (mod). 
This congruence is of degree four and since it is satisfied by all the values (2a) 
then either /?—¢-+ 1=0(modl) or {==—1,2 and 1/2. The first con- 
gruence ® is inconsistent with (2) and t=—1 satisfies (9) identically, but 
{= 2 and 1/2 give in turn 

(4p? + + p) 0 (mod 1), 


>’ (p® + 4p? + 4p) = 0 (mod 1), 


and subtraction gives 


(9a) 


= ( (mod /). 


® Pollaczek, Wiener Bericht, vol. 126 (1917), pp. 1-15. 




Now we have’ 
(4b + 3)'*— 
1—2 
where (mb +-k)" is expanded by the binomial theorem and 0; substituted for 
b‘ in the result. Also 
(45 4-1)** —5;., 
and subtraction of the last two congruences gives 
The left-hand member® is 1-3)/2, where E,=5, 
FE, = 61, ete. are the Euler numbers, and (10) gives with (9a), 


vo 


=(—1)"" (mod l), 
p'— 


(10) 


(11) E (1-3)/2=0 (mod l). 
Emma Lehmer® gave the relation 


[1/4] [1/4] 

ae > (—1) /2 
r=1 T r=1 
modulo J, and remarked that if we could show the left-hand member =0 (mod 1), 
provided (1) holds in Case 1, that (11) follows. Here we may reverse this 
process, as using (11) we have 


1/4] 
> = 0 (mod 


r=1 
as criteria for (1) in Case 1, and this is evidently also included in the class 
of relations (3). Hence we have, using (6a) also, 


THEOREM. If
    $x^l + y^l + z^l = 0$

with $x$, $y$ and $z$ rational integers and $xyz \not\equiv 0 \pmod{l}$; $l$ a given odd prime,
then
    $E_{(l-3)/2} \equiv 0 \pmod{l}$

where $E_1 = 1$, $E_2 = 5$, $E_3 = 61, \dots$, are the Euler numbers. Also

    $\sum_{r=[l/6]+1}^{[l/4]} \frac{1}{r^2} \equiv 0 \pmod{l}$.
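
The Euler numbers in the notation used here ($E_1 = 1$, $E_2 = 5$, $E_3 = 61$) can be generated by a short recurrence, and the relation of Emma Lehmer quoted above can then be tested numerically. The sketch below rests on two assumptions that are not read off the printed display: the recurrence is the standard one for the signed Euler numbers, and Lehmer's relation is taken in the form $\sum_{r=1}^{[l/4]} 1/r^2 \equiv -4 E_{(l-3)/2} \pmod{l}$, with the sign chosen to fit the unsigned notation. Function names are illustrative.

```python
from math import comb

# Sketch: unsigned Euler numbers E_1 = 1, E_2 = 5, E_3 = 61, ... and a numerical
# check of a congruence of Lehmer's type. The exact sign convention of the
# congruence is an assumption, verified here only for small primes.

def euler_numbers(count):
    """Return [E_1, ..., E_count] in the unsigned notation."""
    signed = [1]  # signed[n] is the modern signed Euler number E_{2n}
    for n in range(1, count + 1):
        signed.append(-sum(comb(2 * n, 2 * k) * signed[k] for k in range(n)))
    return [abs(e) for e in signed[1:]]


def quarter_sum(l):
    """Sum of 1/r^2 over 1 <= r <= [l/4], reduced modulo the prime l."""
    # r^(l-3) is the inverse of r^2 modulo l, by Fermat's theorem.
    return sum(pow(r, l - 3, l) for r in range(1, l // 4 + 1)) % l


def lehmer_congruence_holds(l):
    """Check sum_{r <= l/4} 1/r^2 + 4 E_{(l-3)/2} = 0 (mod l) for a prime l >= 5."""
    e = euler_numbers((l - 3) // 2)[-1]
    return (quarter_sum(l) + 4 * e) % l == 0


checked = all(lehmer_congruence_holds(l) for l in (5, 7, 11, 13, 17))
```

Under this relation, the vanishing of $E_{(l-3)/2}$ modulo $l$ and the vanishing of the sum over $r \leq [l/4]$ are the same condition, which is how the two criteria of the theorem are tied together.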


UNIVERSITY OF TEXAS. 


7 Frobenius, Sitzungsberichte, Berlin (1914), p. 655, formula (2) for n = l - 2.
8 Cf. for example, Frobenius, Sitzungsberichte, Berlin (1914), p. 846.
9 Loc. cit., p. 359.




ON EXPANSIONS IN SERIES OF EXPONENTIAL FUNCTIONS.* 


By Marvin G. Moore.


Introduction. Carmichael has expanded functions of exponential type
in a series of exponential functions (3) associated with the exponential sum
$h(t)$ in (1). He has pointed out that, for the special case $h(t) = e^t - 1$,
(3) becomes the Fourier expansion, the natural polygonal region of convergence
reducing to the line-segment $(0, 1)$. We are led, then, to investigate
the possibility of generalizing the properties of biorthogonality and the
convergence theory of the Fourier series to expansions associated with the
more general functions $h(t)$.


I. PRELIMINARY CONSIDERATIONS. 
Let
(1)    $h(t) = c_1 e^{\alpha_1 t} + c_2 e^{\alpha_2 t} + \cdots + c_N e^{\alpha_N t}$
where $c_j \neq 0$ and $\alpha_j \neq \alpha_k$ for $j \neq k$, and where $N \geq 2$.
Let $P$ be the smallest closed convex polygon in the complex plane containing
the points $\alpha_1, \alpha_2, \dots, \alpha_N$. In special cases, this polygon may reduce
to a line-segment. Then Carmichael¹ has demonstrated the existence of contours
$C_1, C_2, \dots$ about the origin having the following properties: first,


there exists a positive e for which 
(2) | | 


for every $x$ in $P$ and for every $t$ on every $C_s$; second, if the sectors $S_\lambda$ are
defined to be those regions in which $R(\alpha_\lambda t) > R(\alpha_\mu t)$ for all $\mu \neq \lambda$ ($R$ being
the real part), $C_s$ lies along the circle having radius $s$ and center at the origin,
except for portions of bounded length lying within a bounded distance of the
rays which separate the sectors $S_\lambda$; and third, no point of $C_s$ lies outside
$C_{s+1}$.

The series with which we are concerned are to be of the form 


(3) 


k=1 


* Presented to the Society, Dec. 30, 1937. Received June 19, 1938; Revised July 1, 
1939. 
1R. D. Carmichael, Transactions of the American Mathematical Society, vol. 35, 
No. 1 (1933), pp. 1-28. 


where the degree of the polynomial $P_{ks}(x)$ is at least one less than the order
of the zero $t_{ks}$ of $h(t)$, and where $t_{1s}, t_{2s}, \dots$ are those zeroes of $h(t)$
lying between $C_{s-1}$ and $C_s$.


II. PROPERTIES OF BIORTHOGONALITY. 


THEOREM 1. Let $w(k, s)$ be the order of the zero $t_{ks}$ of $h(t)$. Let
$C_{ks}$ be a small circle passing through no zero of $h(t)$ and containing only
the zero $t_{ks}$ on its interior. Then, for $q = 0, 1, 2, \dots, w(k, s) - 1$, and
for $a$ any point of the closed region $P$,


1 N an e (Qptr-x,)t 

N 
— Dep f f dtdz, 
a Cym h(t) 


(4) § =0 for tis A tim 


== for lig = 


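Upon integration with respect to $x_1$, the left member of (4) reduces to
the form

$\frac{1}{2\pi i} \sum_{v=0}^{q} \binom{q}{v}\, v!\, a^{q-v} \int_{C_{lm}} \frac{e^{(t_{ks}-t)a + xt}}{(t - t_{ks})^{v+1}}\, dt,$

which, by Cauchy's Integral Theorem, vanishes if $t_{ks} \neq t_{lm}$.

If $t_{ks} = t_{lm}$, it reduces to

$\sum_{v=0}^{q} \binom{q}{v}\, a^{q-v} (x - a)^{v}\, e^{t_{ks} x},$

where $\binom{q}{v}$ is the binomial coefficient, and the expression, by the binomial
theorem, equals

$x^q e^{t_{ks} x}.$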


We have, then, conditions generalizing the biorthogonality conditions
pertaining to the Fourier series, here arising when $h(t) = e^t - 1$, our
results in that case taking the form, after the contour integrals are evalu-
ated: If a is any fixed point of the interval (0,1), then for m and l integers,
positive, negative, or zero,

$e^{2l\pi i(x+1)} \int_a^1 e^{2(m-l)\pi i x_1}\, dx_1 - e^{2l\pi i x} \int_a^0 e^{2(m-l)\pi i x_1}\, dx_1 = 0$ for $m \neq l$,

$= e^{2m\pi i x}$ for $m = l$.


If a = 0, this reduces essentially to the customary form of the statement
of the Fourier biorthogonality conditions.
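
The Fourier special case can be checked numerically. The following is only a minimal sketch, assuming the relation reads as the difference of the two integrals from a to 1 and from a to 0 of $e^{2(m-l)\pi i x_1}$, weighted by $e^{2l\pi i(x+1)}$ and $e^{2l\pi i x}$ respectively; the function names are hypothetical and are not taken from the paper.

```python
import cmath


def exp_integral(c, lo, hi):
    """Closed-form integral of exp(c * x) dx from lo to hi (c complex)."""
    if c == 0:
        return hi - lo
    return (cmath.exp(c * hi) - cmath.exp(c * lo)) / c


def fourier_biorthogonality(m, l, x, a):
    """Left member of the Fourier special case, for integers m and l.

    Assumed form (a reconstruction, not a quotation):
      e^{2 l pi i (x+1)} * int_a^1 e^{2(m-l) pi i x1} dx1
      - e^{2 l pi i x}   * int_a^0 e^{2(m-l) pi i x1} dx1
    """
    k = 2j * cmath.pi * (m - l)
    first = cmath.exp(2j * cmath.pi * l * (x + 1)) * exp_integral(k, a, 1.0)
    second = cmath.exp(2j * cmath.pi * l * x) * exp_integral(k, a, 0.0)
    return first - second


if __name__ == "__main__":
    x, a = 0.37, 0.2
    print(abs(fourier_biorthogonality(3, -2, x, a)))   # expected to be near zero
    print(fourier_biorthogonality(4, 4, x, a))          # expected near exp(2*pi*i*4*x)
```

Under these assumptions the value does not depend on the choice of a, which is the feature the general theorem exploits.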




ON EXPANSIONS IN SERIES OF EXPONENTIAL FUNCTIONS. 85 


Theorem 1, viewed in the light of the theory of biorthogonal functions, 
suggests the examination of the series (see (3) ) 


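(5) $\sum_{s=1}^{\infty} \sum_{k=1}^{G_s} \frac{1}{2\pi i} \sum_{\mu=1}^{N} c_\mu \int_a^{\alpha_\mu} \int_{C_{ks}} f(x_1)\, e^{(\alpha_\mu + x - x_1)t}\, \{h(t)\}^{-1}\, dt\, dx_1,$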


which we shall call the F-series. Series (5) is of the form (3). 


III. LEMMAS ON CONTOUR INTEGRALS. 


We shall find it convenient to define $P_\eta$ as the region P exclusive of the
portions $|x - \alpha_\mu| < \eta$ about the vertices $\alpha_\mu$.


Lemma 1. For every positive η, there exists a positive K for which

$\left| \int_{C_s} e^{xt} \{h(t)\}^{-1}\, dt \right| < K$

for $s = 1, 2, \ldots$, and for x in $P_\eta$.


Since, by (2), the integrand is dominated in absolute value by ¢? at 
every point ¢ of Cs, for all s, and for all x in P, it follows that the portion 
of Cs which does not lie on the circle of radius s, having for its length a 
bounded function of s, contributes a bounded quantity to the value of the 
above integral. 

To show that a bounded quantity is also contributed by each of the 
circular arcs of Cs, which remain; for any sector Sy, let 2—a, = ret¥ 
t = pe'”; where y, ¢ are real and r, p are positive. It follows almost imme- 
diately from the definition of S, that, for z in P and for ¢ in &§), 
R(at) S R(t), so that 

cos(y +o) 0. 


On the arc of $C_s$ in $S_\mu$, as well as at all other points of $C_s$ and for all s,
e* {h(t) }-! | < €*, so that we shall, along this arc, dominate the integral by 


(3/2) 
( 


Since, for 44 = az, cost S — 241 (4 — we may dominate this 
expression by 


T 
2c = arte er?) < 
( 


1/2)7r 


for $r \geq \eta$. The lemma has then been proved.




LEMMA 2. Let $\theta_\mu$ be the supplement of the angle of P at $\alpha_\mu$ if $\alpha_\mu$ is a
vertex of P, and let it otherwise be zero. Then
emt dt 10 
We shall consider only the case for which $\alpha_\mu$ is a vertex, for otherwise
the conclusion is a special case of Carmichael's result that

(6) $\lim_{s \to \infty} \int_{C_s} e^{xt} \{h(t)\}^{-1}\, dt = 0$

for x in P and not a vertex.²
Breaking $C_s$ up into C′ and C″, where C″ is that portion of $C_s$ in $S_\mu$,


dt, dt di 

Take parabolas ³ with vertices at the origin and with the bounding rays
of $S_\mu$ as principal diameters. Writing $t = \rho e^{i\varphi}$, we then see that the result-
ing integrands approach zero for t outside the parabolas, for s becoming
infinite, and are bounded for t inside the parabolas. The ranges of integra-
tion with respect to φ approach zero inside the parabolas, so that the integrals
with respect to φ approach zero. The integrals with respect to ρ likewise
approach zero, the integrands approaching zero in that case.


IV. CONVERGENCE OF THE F-SERIES. 


Let P′ be a convex polygon contained in P and let it have the property
that for every point $\alpha_\mu$ ($\mu = 1, 2, \ldots, N$), there exists a point $\alpha'_\mu$ in P′ for
which $\alpha_\mu + x - \alpha'_\mu$ lies in P and not at a vertex of P, for all x in P′ and
not vertices of P′. In particular, P′ may coincide with P, in which case
$\alpha'_\mu = \alpha_\mu$.

Let a curve $H_\mu$ from each point $\alpha'_\mu$ to the corresponding point $\alpha_\mu$ be
made up of a finite number of straight-line segments and let it have the
property that, for every point $x_1$ on $H_\mu$, and for every x in P′ and not a
vertex of P′, $\alpha_\mu + x - x_1$ lies in P and not at a vertex of P. In particular,
the curves $H_\mu$ may be taken to be straight lines, although, if straight lines
are not suitable, we are not, in general, restricted to them.


² Carmichael, loc. cit., p. 24.
³ Carmichael uses such parabolas for similar purposes, loc. cit., p. 24.




We shall find it convenient to use the notation

$f[x + 0(\zeta - x)] = \lim f[x + \sigma(\zeta - x)] \quad (\sigma > 0;\ \sigma \to 0),$

when the limit exists, in the following theorem.

Let $f(x_1)$ be so defined that both its real and imaginary parts are summable
(L) along each of the line segments of which the curves $H_\mu$ are formed and
also along every straight line segment in closed P′. Further, if P′ does not
reduce to a straight line segment, let $f(x_1)$ be analytic in open P′ and let its
integral between any two points of closed P′ (taken along any finite num-
ber of straight-line segments) be independent of the path of integration
in P′. Let a be any point of P′ and let the paths of integration $L_\mu$ from
a to $\alpha_\mu$ ($\mu = 1, 2, \ldots, N$) be made up of the straight lines from a to $\alpha'_\mu$
combined with the curves $H_\mu$.
THEOREM 2. Let the above hypotheses be satisfied. 


Then, first, for every point x on the interior of P’ (if there be such 
points) ; and second, for every x on the boundary, and not at a vertex of P’, 
for which there exists a positive number yn, such that both the real and 
imaginary parts of f(a.) are of bounded variation in the linear interval 
the F-series for f(x) associated 


with h(t) and +,N) converges to 
N 
For x on the interior of P’, this expression is equal to f(x). 


On breaking up the functions $f(x_1)$ and

(7) $\int_{C_{ks}} e^{(\alpha_\mu + x - x_1)t} \{h(t)\}^{-1}\, dt$

(which we shall write as Q) into their real and imaginary parts for $x_1$ on any
straight-line segment (β, γ) on $L_\mu$, since both the real and imaginary parts
of Q have continuous derivatives with respect to $x_1$,

(8) $\int_\beta^\gamma f(x_1)\, Q\, dx_1$

(which we shall write as I(β, γ)) may be written as the sum of real and
imaginary Lebesgue integrals, each of which may be integrated by parts,⁴
so that


*E. W. Hobson, The Theory of Functions of a Real Variable, vol. I (1927), r —.6. 




(9) v dQ 
far, f — ff, f( 22) 


If (β, γ) lies in P′, both factors of the first term of (9) are independent of
the path of integration in P′, while both factors of the integrand in the second
term are continuous throughout closed P′ and analytic on the interior, so that
the second term of (9) is independent of the path of integration in P′, and
(8) must be also.
Now let $a_1$ and $a_2$ be any two choices of the point a. Then, under our
hypotheses,
(a, (a, Su) } -f (a1) f tdidz,, 
a Cis 


which, by the Cauchy Integral Theorem, vanishes, so that every term of (5)
is independent of a, for a in P′.

We may then (and shall) take a at the point x, writing, by the theory of 
residues, the sum of the first s terms of (5) in the form 


1 
Cw) (x, A), 


where U and J replace Q and I respectively for $C_{ks}$ replaced by $C_s$. We shall
then consider the separate terms (for convenience dropping the subscript μ),
which may be written as
(10) fla + — x) 
J) 
, 
Upon evaluation of the first integral in (10), its limit as s becomes 

infinite is seen, by (6) and lemma 2, to be equal to 


1 , 


It will, then, be sufficient for our proof to show that the limit of the second 
term of (10) vanishes. 


We now set 


R{f (x1) — + — x) ]} = Ai (a, a1) — 1) 


for x, on L (where L now joins x to a), A; and A» being monotonic functions 
of x, for in the open interval [a, —«x)], approaching zero as 
z,—>z and being summable on L. Then for every positive € there exists a 




positive <m for which | Ai(z,2,)| <¢ for in the linear interval 
[x, 2 + (a —~x)], so that, upon using the second mean value theorem,® we 
see that 
A, (x, 0,)R(U)dR(2y) 
may be dominated in absolute value by 


(11) | R(U)dR (2) |, 


where lies in the interval Since dR(z,)/dz, is to be 
constant, we may actually carry out the integration, and then, by (2), we 
find that (11) is dominated by 4xfe'. 
Take now any straight line portion (β, γ) of the part of L not involved
in (11) and let us consider

(12) $\int_\beta^\gamma A_1(x, x_1)\, R(U)\, dR(x_1).$

As we prepare to apply Hobson's General Convergence Theorem,⁶ we
note first that, by our hypotheses, $\alpha + x - x_1$ lies in P and not at a vertex for
every $x_1$ on closed (β, γ). It must, then, be bounded away from the vertices,
so that for every positive ζ there exists an η for which $\alpha + x - x_1$ lies in $P_\eta$.
By lemma 1, there then exists a positive K for which |U| < K for all s, so
that |R(U)| is also bounded, and Hobson's first condition is satisfied.


Noting that 
f Udz, 
B 


we carry out the integration of U with respect to $x_1$ and then apply (6) to
show that the second condition is also satisfied.
Then, for every positive 7, (12) approaches zero as s becomes infinite. 
We may then combine the finite number of line-segments which form L to 
obtain the result: For every positive 7 and for every positive ¢, there exists a 


positive integer §, for which 


_ |2(y) —#(B) 


< | f (a1) 


(13) | "Ax (2, UdR(a)| < 


for > 


⁵ E. W. Hobson, loc. cit., p. 618.

⁶ Reference will be made to E. W. Hobson, The Theory of Functions of a Real
Variable, vol. II (1926), p. 422; see also Proceedings of the London Mathematical
Society (2), vol. VI (1908), p. 349, and (2), vol. XII (1912), p. 166.




We now note that the second term in (10) may, by the separation of the 
real and imaginary parts of the factors of the integrand, be divided up into a 
finite number of parts each of which may be given essentially the same treat- 
ment as we have given the integral in (13), so that the second term in (10) 
approaches zero as s becomes infinite, and the F-series converges to 


2 2 + 


In particular, if $h(t) = e^t - 1$ so that we are dealing with the Fourier
series on the interval (0, 1), we find the series to converge to

$\tfrac{1}{2}\{f(x + 0) + f(x - 0)\}.$


It may be shown by similar methods, for P′ coinciding with P, that, at
the vertex $\alpha_\mu$, the F-series converges to


In particular, the Fourier series converges to

$\tfrac{1}{2}\{f(1 - 0) + f(+0)\}$

at either end-point of the interval.
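
The end-point statement for the Fourier case can be illustrated numerically. The sketch below is an assumption-laden example rather than anything drawn from the paper: it takes f(x) = x on (0, 1), whose Fourier series is known to be $\tfrac{1}{2} - \sum_{k \ge 1} \sin(2\pi k x)/(\pi k)$, and evaluates symmetric partial sums; the function name is hypothetical.

```python
import math


def partial_sum(x, n_terms):
    """Symmetric Fourier partial sum of f(x) = x on (0, 1).

    Uses the classical expansion x = 1/2 - sum_{k>=1} sin(2 pi k x) / (pi k),
    valid for 0 < x < 1.
    """
    total = 0.5
    for k in range(1, n_terms + 1):
        total -= math.sin(2 * math.pi * k * x) / (math.pi * k)
    return total


if __name__ == "__main__":
    # At an interior point the sums approach f(x) itself.
    print(partial_sum(0.3, 2000))
    # At either end-point they approach (f(1-0) + f(+0)) / 2 = (1 + 0) / 2.
    print(partial_sum(0.0, 2000), partial_sum(1.0, 2000))
```

Here f(1 − 0) = 1 and f(+0) = 0, so the end-point value one half agrees with the mean stated above.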


INDIANA UNIVERSITY, 
BLOOMINGTON, INDIANA. 




EXTREMAL PROBLEMS FOR FUNCTIONS ANALYTIC AND 
SINGLE-VALUED IN A DOUBLY-CONNECTED REGION.* 


By Maurice H. HEINs. 


1. Introduction. It is well-known that certain fundamental inequalities
of analysis such as Julia's Principle of the Harmonic Majorant,¹ the Two
Constant Theorem,² Lindelöf's Principle,³ the Principle of Hyperbolic Measure ⁴
are the "best possible" when the domain of definition $G_z$ for the functions
w(z) involved is simply-connected. When one considers, however, functions
which are analytic and uniform in a multiply-connected region, these in-
equalities are, in general, no longer the "best possible"; it is then a question
of interest to determine effectively exact bounds and the associated extremal
functions for these inequalities when we restrict our attention to functions
which are analytic and single-valued in a given multiply-connected region $G_z$.

By application of the Poincaré Uniformisation Theorem ⁵ and the Pick-
Nevanlinna theory of interpolation ⁶ one can determine effectively the exact
bounds and the associated extremal functions for the inequalities cited above
for the case where $G_z$ is doubly-connected and has as its boundary two disjoint
continua. To this end we shall study the problem of interpolation for bounded
functions which are analytic in the unit circle and satisfy a given functional
relation. By the results of this study we shall give a method for determining
effectively the exact bounds at a given point z of $G_z$ and the associated
extremal functions for the following inequalities: 1) Julia's Principle of the
Harmonic Majorant of which the Nevanlinna-Ostrowski Two Constant
Theorem and Hadamard's Three Circle Theorem are special cases, 2) the
Principle of Hyperbolic Measure of which the Aumann-Carathéodory
"Starrheitssatz" ⁷ is a special case. In addition, our preliminary study permits the
complete treatment of the analogue of the Pick-Nevanlinna problem for the
case where the interpolating functions are analytic and single-valued in a
doubly-connected region. Lammel has considered related interpolation
problems.


* Received March 6, 1939.

¹ G. Julia, Principes géométriques d'analyse, 2ième partie (Paris, 1932), pp. 26-27.

² R. Nevanlinna, Eindeutige Analytische Funktionen (Berlin, 1936), pp. 41-42.

³ E. Lindelöf, "Mémoire sur certaines inégalités dans la théorie des fonctions
monogènes et sur quelques propriétés nouvelles de ces fonctions dans le voisinage d'un
point singulier essentiel," Acta Soc. Sci. Fenn., 35 Nr. 7 (1908).

⁴ R. Nevanlinna, l. c., pp. 45-51.

⁵ H. Poincaré, "Sur l'uniformisation des fonctions analytiques," Acta Mathematica,
vol. 31 (1907).

⁶ R. Nevanlinna, "Ueber beschränkte analytische Funktionen," Ann. Acad. Sci.
Fenn., vol. 32, No. 7.

Recently Carlson ⁸ and Teichmüller ⁹ have considered the problem of
improving the Hadamard Three Circle Theorem for functions which are
uniform. Their methods are quite distinct from ours, which admit application
to other problems as well: the extremal problem for the Principle of Hyper-
bolic Measure and the Pick-Nevanlinna interpolation problem for doubly-
connected regions.

The author wishes to express his thanks to Professor Walsh for his 
helpful discussions during the preparation of this paper. 


2. The Pick-Nevanlinna theory of interpolation. In this section we 
shall state briefly the principal results of the Pick-Nevanlinna theory of inter- 
polation important for the sequel. For a detailed account of this theory the 
reader is referred to the treatise of Walsh.¹⁰

Let E denote the class of functions w(z) analytic for |z| < 1 and
satisfying there the inequality |w(z)| ≤ 1. Let α be any complex number
for which |α| < 1. We denote by L(z, α) the linear fractional function

$L(z, \alpha) = \frac{\alpha - z}{1 - \bar{\alpha} z}.$

Further let the points $z_1, z_2, \ldots, z_n$ be given interior to the unit circle |z| = 1
and let there be associated with each $z_k$ a complex number $w_k$, $|w_k| < 1$
($k = 1, 2, \ldots, n$). Define $w_2^{(1)}, \ldots, w_n^{(1)}$ by


(2.1) L(w,, w,) L | 2, | wy) 

(where Le rn 2,) is to be replaced by z, if z,=0), and, in general, 
w,) (k=v+1,--+,n) by the recursive formula 


⁷ G. Aumann and C. Carathéodory, "Ein Satz über die konforme Abbildung mehr-
fach zusammenhängender ebener Gebiete," Mathematische Annalen, vol. 109, pp. 756-763.

⁸ F. Carlson, "Sur le module maximum d'une fonction analytique uniforme," Ark.
för Mat. Astron. och Fys. Bd. 26, 2A9, pp. 1-13.

⁹ O. Teichmüller, "Eine Verschärfung des Dreikreisesatzes," Deutsche Mathematik
vol. 1 (1939), pp. 16-22.

¹⁰ J. L. Walsh, Interpolation and Approximation by Rational Functions in the
Complex Domain, New York, pp. 286-304.




(2. 2) wy) = L (2x, 2v) L(we™, | 2v | wv?) 
(k=v+1,---,n) 


| 29 | L(z,2v) is to be replaced by z, if zx 0). Then we have 


Zy 


(where 


THEOREM 2.1. A necessary and sufficient condition that there exist a
function w(z) ⊂ E for which $w(z_k) = w_k$ ($k = 1, 2, \ldots, n$), where the $z_k$ are n distinct points

interior to | z| = 1 is that either 


or 2) $|w_1| < 1,\ |w_2^{(1)}| < 1,\ \ldots,\ |w_n^{(n-1)}| < 1$.


If 1) occurs, w(z) which satisfies the interpolation requirements is unique
and is given by the formulas (2.1) and (2.2) in conjunction with


L (2, 21) L(w,(z), | | wi) 


(2.3) L(wo(z), 
1 


(2.5) wv(%) = ==y+1,---,n), 


where $w_0(z) = w(z)$.
If 2) occurs, w(z) is not unique. All such functions and only such
functions are given by the formulas (2.2), (2.4), and (2.5), where $w_n(z)$
is any function of class E.
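
The recursive reduction behind Theorem 2.1 can be sketched in code. This is a minimal illustration of a Schur-type reduction, not the paper's exact formulas: it assumes the plain factor (z − a)/(1 − ā z) in place of L and omits the unimodular normalizing factors of (2.1) and (2.2), which change the reduced values only by a factor of modulus one and so leave the moduli tested in case 2) unchanged. All function names are hypothetical.

```python
def blaschke(z, a):
    """Disc automorphism (z - a) / (1 - conj(a) z); assumes |a| < 1."""
    return (z - a) / (1 - a.conjugate() * z)


def schur_parameters(points, values):
    """Return the successive leading values w_1, w_2^(1), w_3^(2), ...

    At each step the first point and value are split off and the remaining
    values are replaced by blaschke(w_k, w_1) / blaschke(z_k, z_1).
    The recursion stops early once a leading value has modulus >= 1.
    The points are assumed distinct and interior to the unit circle.
    """
    zs = [complex(z) for z in points]
    ws = [complex(w) for w in values]
    params = []
    while ws:
        z0, w0 = zs[0], ws[0]
        params.append(w0)
        if abs(w0) >= 1:
            break
        ws = [blaschke(w, w0) / blaschke(z, z0) for z, w in zip(zs[1:], ws[1:])]
        zs = zs[1:]
    return params


def strictly_solvable(points, values):
    """True when every leading value has modulus < 1 (case 2 of the theorem)."""
    params = schur_parameters(points, values)
    return len(params) == len(points) and all(abs(p) < 1 for p in params)


if __name__ == "__main__":
    pts = [0.0, 0.5, -0.3]
    print(strictly_solvable(pts, [z * z / 2 for z in pts]))  # data taken from w(z) = z^2 / 2
    print(strictly_solvable([0.0, 0.5], [0.0, 0.9]))          # data violating Schwarz's lemma
```

The first data set comes from a function of modulus below one in the disc, so every reduced value stays inside the unit circle; the second cannot be interpolated because a function vanishing at the origin satisfies |w(z)| ≤ |z|.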


Further, if $z_1, z_2, \ldots$ ($|z_k| < 1$) are infinite in number, Theorem 2.1
admits the extension


THEOREM 2.2. A necessary and sufficient condition that there exist a
function w(z) ⊂ E for which $w(z_k) = w_k$ ($k = 1, 2, \ldots$) is that either


Wi) == wh) 


If 1) occurs, w(z) with the required properties is unique and is given 
by the recursive formulas (2.2), (2.4), and (2.5) where wy(z) =w™, 


In the situation of Theorem 2.2 we have 




THEOREM 2.3. If there exists a function w(z) ⊂ E satisfying the inter-
polation requirements of Theorem 2.2, then all w(z) ⊂ E satisfying these
requirements can be expressed in the form

(2.6) $w(z) = \frac{P(z) - Q(z)\, w_\infty(z)}{1 - S(z)\, w_\infty(z)},$

where P, Q, and S are specific functions of class E defined by the interpolation
requirements and $w_\infty(z)$ is an arbitrary function of class E. And conversely,
every function w(z) defined by (2.6) where $w_\infty(z)$ is an arbitrary function
of class E belongs to class E and satisfies the interpolation requirements of
Theorem 2.2.


A necessary and sufficient condition that w ⊂ E satisfying the inter-
polation requirements of Theorem 2.2 be unique is that

$PS - Q \equiv 0.$


Remark. If $PS - Q \not\equiv 0$, the function PS − Q vanishes at the points
$z_k$ and at no other points for |z| < 1.


3. A particular interpolation problem. We now turn our attention to
a particular interpolation problem which is fundamental for the study that
we shall make. Let Tz ($\not\equiv z$) denote any linear fractional transformation
mapping |z| < 1 onto itself (in the sequel we shall consider exclusively the
case where T is hyperbolic), and let Uz denote a second such transformation,
but here we do not require that $Uz \not\equiv z$; in fact, the case where $Uz \equiv z$ is of
prime importance. We wish to study those functions w(z) ⊂ E which satisfy
certain interpolation requirements at assigned points $z_k$ ($k = 1, 2, \ldots, n$ or
$k = 1, 2, \ldots$) and which satisfy for |z| < 1 the functional relation

(3.1) $w(T) = U[w(z)].$


We shall demonstrate the following 
THEOREM 3.1. A necessary and sufficient condition that there exist a
function w(z) ⊂ E for which

$w(z_k) = w_k,\ |w_k| < 1,\quad w(T^m z_k) = U^m[w_k] \quad (m = \pm 1, \pm 2, \ldots)$

(it is assumed that these interpolation requirements are consistent) and which
satisfies the functional relation (3.1) is that there exist a function w*(z) ⊂ E
which satisfies the interpolation requirements for w(z).




It is clear that this condition is necessary.
To prove that it is sufficient we note that if w*(z), the existence of which
is posited, is unique, then

$w^*(T) = U[w^*(z)],$

for $U^{-1}[w^*(T)]$ ⊂ E and satisfies the same interpolation requirements as
w*(z). Therefore

$U^{-1}[w^*(T)] \equiv w^*(z)$
or $w^*(T) = U[w^*(z)]$.


If w*(z) is not unique, let {w*(z)} denote the totality of functions
w*(z) which satisfy the interpolation requirements and let $z_0$ be any point
distinct from all the points $T^m z_k$ ($k = 1, 2, \ldots, n$ or $k = 1, 2, \ldots$;
$m = 0, \pm 1, \pm 2, \ldots$).

Then $\{w^*(z_0)\}$ is the totality of values which the functions w*(z) ⊂ {w*(z)}
take on at $z_0$. Consider the set $\{U^{-1}[w^*(Tz_0)]\}$. We assert that

(3.2) $\{w^*(z_0)\} = \{U^{-1}[w^*(Tz_0)]\}.$

Let $w^*_1(z)$ ⊂ {w*(z)}, then it follows that $U^{-1}[w^*_1(T)]$ ⊂ E and satisfies
the interpolation requirements; therefore $U^{-1}[w^*_1(Tz_0)]$ ⊂ $\{w^*(z_0)\}$ and
therefore

$\{U^{-1}[w^*(Tz_0)]\} \subset \{w^*(z_0)\}.$

Also if $w^*_2(z)$ ⊂ {w*(z)}, then $w^*_2(z) = U^{-1}[U[w^*_2(T^{-1}Tz)]]$. But
$U[w^*_2(T^{-1})]$ ⊂ E and satisfies the interpolation requirements. Therefore
$w^*_2(z_0)$ ⊂ $\{U^{-1}[w^*(Tz_0)]\}$ and it follows that

$\{w^*(z_0)\} \subset \{U^{-1}[w^*(Tz_0)]\}.$

Therefore the relation (3.2) is verified. From this fact we shall deduce
certain functional relations between P(z), Q(z), S(z) and P(T), Q(T),
S(T). Let us note that since the solution of the interpolation problem is not
unique and since $z_0$ is distinct from $T^m z_k$ ($k = 1, 2, \ldots, n$ or $k = 1, 2, \ldots$;
$m = 0, \pm 1, \pm 2, \ldots$), $P(z_0)S(z_0) - Q(z_0) \neq 0$ (cf. remark Theorem 2.3),
and therefore the set $\{w^*(z_0)\}$ fills a proper circle which we shall denote by
$K_{z_0}$. This follows from the formula (2.6) where w(z) is replaced by w*(z)
and the statement of the interpolation requirements of Theorem 2.3 is
replaced by the statement of the interpolation requirements of the theorem
which we are to prove. The transformation


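(3.3) $w^* = \frac{P(z_0) - Q(z_0)\, w^*_\infty}{1 - S(z_0)\, w^*_\infty}$

maps $|w^*_\infty| \le 1$ onto $K_{z_0}$; and the transformation

(3.4) $w^* = U^{-1}\left[ \frac{P(Tz_0) - Q(Tz_0)\, w^*_\infty}{1 - S(Tz_0)\, w^*_\infty} \right]$

maps $|w^*_\infty| \le 1$ onto $K_{z_0}$ by virtue of the relation (3.2). Therefore the
transformation

(3.5) $\frac{P(z_0) - Q(z_0)\, t}{1 - S(z_0)\, t} = U^{-1}\left[ \frac{P(Tz_0) - Q(Tz_0)\, \tau}{1 - S(Tz_0)\, \tau} \right]$

is non-degenerate and maps $|t| \le 1$ onto $|\tau| \le 1$. Let us write U(z) in
the form

$e^{i\theta}(z - a)/(\bar{a}z - 1) \quad (|a| < 1,\ 0 \le \theta < 2\pi).$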


Then (3.5) takes the equivalent form 
Q (2) — aS (20) t] 


P(T zo) —Q(T%)7 %P(z) —1 a@P(Z) —1 
1— 8(T%)r (Zo) — 8 (2) 
@P(z) —1 


It follows that (3.6) can be written in the form

$\tau = [\lambda(z_0) + \epsilon(z_0)\, t] / [1 + \bar{\lambda}(z_0)\, \epsilon(z_0)\, t]$

where $\lambda(z_0)$ and $\epsilon(z_0)$ are suitably chosen, $|\lambda(z_0)| < 1$, $|\epsilon(z_0)| = 1$. A
necessary and sufficient condition that the transformation (3.6) can be
written in this form is that the following equations be satisfied:


e(P(%) —%) P(T%) —A(%) (T20) 
aP(z%) —1 1—A(%)S(T%) 
(3.7) (20) — aS (zo) ) Q (Tz) —A(%) P(T%) 
@P(z) —1 1—A(%)S(T2) 
Now the equations (3.7) with the subscript dropped determine ε(z), λ(z),
$\bar{\lambda}(z)$ as functions of z, single-valued, analytic and defined for all values of z
interior to the unit circle |z| = 1 other than the $\{T^m z_k\}$. The conditions
|ε(z)| = 1 and $\lambda(z)\bar{\lambda}(z)$ real for every such z imply that ε and λ are con-
stant. Therefore the relations (3.7) are valid for all z, |z| < 1 where ε, λ
are constant.


¹¹ Ibid. In particular, pp. 296-304.

Suppose now that w(z) satisfies the required interpolation conditions
and further the functional relation (3.1). Then w(z) can be written in the
form

(3.8) $w(z) = \frac{P(z) - Q(z)\, w_\infty(z)}{1 - S(z)\, w_\infty(z)}$

and $w_\infty(z)$ satisfies the functional relation

(3.9) $\frac{P(T) - Q(T)\, w_\infty(T)}{1 - S(T)\, w_\infty(T)} = U\left[ \frac{P(z) - Q(z)\, w_\infty(z)}{1 - S(z)\, w_\infty(z)} \right]$

and from our discussion we infer that

(3.10) $w_\infty(T) = U^*[w_\infty(z)],$

where

$U^*(\tau) = (\lambda + \epsilon\tau)/(1 + \bar{\lambda}\epsilon\tau).$


Conversely, if $w_\infty(z)$ ⊂ E and satisfies the functional relation (3.10),
w(z) as given by (3.8) satisfies the required interpolation conditions and
further the functional relation (3.1). Now there always exists a function
$w_\infty(z)$ ⊂ E which satisfies the functional relation (3.10). For since U*
maps the closed interior of the unit circle onto itself, it is either the identical
transformation or a transformation of one of the following types: elliptic,
hyperbolic, parabolic. If U* is the identical transformation, the existence of
a function $w_\infty(z)$ ⊂ E satisfying (3.10) is evident; any constant k, |k| ≤ 1
satisfies (3.10). If U* is not the identical transformation, then it is well-
known from the theory of linear fractional transformations that U*(τ) has
at least one fixed point in the closed interior of the unit circle |τ| ≤ 1. Let
τ* be a fixed point of U*(τ); then it is evident that τ* satisfies the relation
(3.10). (We shall return to the study of the functional equation (3.10)
and consider the possibility of non-constant solutions.) Thus Theorem 3.1
is established.
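
The fixed-point step can be made concrete. The sketch below assumes that U* has the form (λ + ετ)/(1 + λ̄ετ) with |λ| < 1 and |ε| = 1 (the conjugate in the denominator is a reading of the printed formula, not a certainty); the fixed points then satisfy the quadratic λ̄ε τ² + (1 − ε)τ − λ = 0, whose roots have product of modulus one when λ ≠ 0, so at least one root lies in the closed unit circle. The function names are hypothetical.

```python
import cmath


def u_star(tau, lam, eps):
    """Assumed form of U*: (lam + eps*tau) / (1 + conj(lam)*eps*tau)."""
    return (lam + eps * tau) / (1 + lam.conjugate() * eps * tau)


def fixed_points(lam, eps):
    """Roots of conj(lam)*eps*tau^2 + (1 - eps)*tau - lam = 0."""
    a = lam.conjugate() * eps
    b = 1 - eps
    c = -lam
    if a == 0:
        # lam = 0: U* is a rotation (or the identity when eps = 1).
        return [] if b == 0 else [-c / b]
    disc = cmath.sqrt(b * b - 4 * a * c)
    return [(-b + disc) / (2 * a), (-b - disc) / (2 * a)]


def fixed_point_in_closed_disc(lam, eps):
    """Fixed point of smallest modulus, or None when U* is the identity."""
    roots = fixed_points(lam, eps)
    return min(roots, key=abs) if roots else None


if __name__ == "__main__":
    lam, eps = 0.3 + 0.2j, cmath.exp(0.7j)
    tau = fixed_point_in_closed_disc(lam, eps)
    print(tau, abs(tau), abs(u_star(tau, lam, eps) - tau))
```

The constant function equal to this fixed point is then a solution of (3.10) of class E, which is all the existence argument requires.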

Let us remark that when w*(z) satisfying the interpolation requirements
is not unique, there is a one-to-one correspondence between the functions
$w_\infty(z)$ satisfying the functional relation (3.10) and the functions w(z)
which satisfy the interpolation conditions of Theorem 3.1 and the functional
equation (3.1).

Denjoy ¹² has given the following criterion for the uniqueness of w(z) of
Theorem 2.2: A necessary and sufficient condition that the function w(z)
of Theorem 2.2 2) be unique is the divergence of the series


(3. 11) 


Let us remark that in the applications which we shall make of Theorem
3.1, we shall consider exclusively the case where T is hyperbolic. We shall
demonstrate that if T is hyperbolic, a necessary and sufficient condition that
w(z) of Theorem 3.1 be unique is that the function w*(z) of Theorem 3.1
be unique. But a necessary and sufficient condition for the uniqueness of
w*(z) and therefore for the uniqueness of w(z) is the criterion of Denjoy
(3.11) where the notation is suitably modified.

As we have shown, if w*(z) of Theorem 3.1 is unique, then w(z) is also 
unique. Suppose now that w*(z) is not unique. We shall show that w(z) 
is not unique. 

Let us recall that there is a one-to-one correspondence between the w(z)
of Theorem 3.1 and the functions $w_\infty(z)$ ⊂ E which satisfy the functional
relation (3.10). If U* of the relation (3.10) is the identical transformation
or hyperbolic, there is more than one solution of (3.10) which belongs to
class E. Two cases remain. U* may be parabolic or elliptic. But in these
cases the relation (3.10) may be reduced to the following canonical forms:

A) U* elliptic

$f(\lambda z) = e^{i\theta} f(z)$, R(z) > 0,
θ real, λ positive (≠ 1),

B) U* parabolic

$f(\lambda z) = f(z) + i\mu$, R(f) > 0, R(z) > 0,
μ real, λ positive (≠ 1).

Let us consider Case A). It is clear that the function

$K e^{i\theta \log z/\log \lambda}$

satisfies the equation

$f(\lambda z) = e^{i\theta} f(z)$

where K is a constant. We seek solutions f such that |f| ≤ 1 for R(z) > 0.
We have

$|K e^{i\theta \log z/\log \lambda}| = |K|\, e^{-\theta \arg z/\log \lambda};$

¹² A. Denjoy, "Sur une classe des fonctions analytiques," C. R. de l'Acad. des sci.
de Paris (7 janvier 1929), pp. 140-142.


| wy?) | 
|! 
| 
{ 




since R(z) > 0, |arg z| < π/2 and therefore $e^{-\theta \arg z/\log \lambda}$ is bounded for
R(z) > 0. Thus if we choose |K| sufficiently small, $|K e^{i\theta \log z/\log \lambda}| \le 1$
and therefore there is an infinity of functions satisfying 1) $f(\lambda z) = e^{i\theta} f(z)$
and 2) |f| ≤ 1 for R(z) > 0 where θ is real and λ positive (≠ 1).
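
A quick numerical check of Case A) may help. The sketch assumes the solution has the form $K e^{i\theta \log z/\log \lambda}$ with the principal logarithm on the half-plane R(z) > 0, and picks the particular constant K = exp(−|θ|π / (2|log λ|)), which is one admissible choice among the infinitely many; the function name is hypothetical.

```python
import cmath
import math


def elliptic_solution(theta, lam):
    """Return f(z) = K * exp(i*theta*log(z)/log(lam)) on Re(z) > 0.

    K is chosen so that |f| <= 1 on the half-plane: since |arg z| < pi/2,
    |exp(i*theta*log(z)/log(lam))| = exp(-theta*arg(z)/log(lam))
    never exceeds exp(|theta| * pi / (2 * |log(lam)|)).
    Assumes lam > 0 and lam != 1.
    """
    bound = math.exp(abs(theta) * math.pi / (2 * abs(math.log(lam))))
    k_const = 1.0 / bound

    def f(z):
        return k_const * cmath.exp(1j * theta * cmath.log(z) / math.log(lam))

    return f


if __name__ == "__main__":
    theta, lam = 1.3, 2.5
    f = elliptic_solution(theta, lam)
    z = 0.4 + 1.7j
    # Functional equation f(lam*z) = e^{i*theta} f(z), and the bound |f| <= 1.
    print(abs(f(lam * z) - cmath.exp(1j * theta) * f(z)), abs(f(z)))
```

Any smaller |K| works equally well, which is the source of the infinity of solutions mentioned above.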
Similarly B) may be discussed. The functions

$K + i\mu \log z/\log \lambda$

where K is constant and R(K) is sufficiently large satisfy all the require-
ments for f of B). Therefore returning to the equation (3.10), we find that,
if U* is elliptic or parabolic, there is always an infinity of solutions, and
therefore, if w* is not unique, w is not unique. Thus we have


THEOREM 3.2. Let T of Theorem 3.1 be hyperbolic. Then the criterion 
of Denjoy is a necessary and sufficient condition for the uniqueness of w of 
Theorem 3.1. 


4. The principle of the harmonic majorant.¹³ Julia has stated and
proved in his "Principes géométriques d'analyse" the following principle
which he terms the "Principle of the Harmonic Majorant":

“Let f(z) be a function of the complex variable z which satisfies the 
following conditions: 


1) The function f(z) is analytic and regular at every point of a region
$G_z$. The modulus |f(z)| is single-valued for z ⊂ $G_z$.


2) There exists a function u(z) harmonic and single-valued in $G_z$ such
that in the neighborhood of the boundary log |f(z)| − u(z) is less than
every positive number; that is, for each point ζ of the boundary and for every
positive ε, there exists a circle with center ζ such that at every point of $G_z$
interior to this circle the inequality

log |f(z)| − u(z) < ε

is satisfied.
If these conditions are satisfied, log |f(z)| ≤ u(z) at every point of $G_z$.
If the equality log |f(z)| = u(z) takes place at an interior point of $G_z$,
f(z) is of the form

$e^{u(z) + iv(z)}$

where v(z) is a conjugate function of u(z)."
Let $G_z$ be a doubly-connected region whose boundary consists of two
disjoint continua. (We recall that a continuum is a closed set, not a single
point, which is well-chained. The degenerate cases may be dismissed as
trivial.) Further let us require that f(z) be uniform for z ⊂ $G_z$ as well as
that it satisfy the hypotheses of the Principle of the Harmonic Majorant.
If $e^{u+iv}$ is single-valued for z ⊂ $G_z$, then the inequality

(4.1) log |f(z)| ≤ u(z)

is a "best possible" inequality and $e^{u+iv}$ is an extremal function for the
class of functions f(z) which we are considering.


¹³ G. Julia, l. c.

In general, $e^{u+iv}$, which we shall denote by φ(z), is not single-valued and
the inequality (4.1) is strong. If we continue an element of φ(z) from a
given point of $G_z$ along a closed path in $G_z$ containing in its interior one and
only one of the continua which constitute the boundary of $G_z$ back to the
same given point, φ(z) changes to $e^{i\theta}\varphi(z)$, which transformation we shall
denote symbolically by

$\varphi(z) \to e^{i\theta}\varphi(z) \quad (0 < \theta < 2\pi).$

If f(z) is analytic and single-valued for z ⊂ $G_z$ and satisfies (4.1), it follows
that f/φ is analytic and has a single-valued modulus for z ⊂ $G_z$. Furthermore
|f/φ| ≤ 1, and $f/\varphi \to e^{-i\theta} f/\varphi$. Conversely, if ψ is analytic and has a single-valued
modulus for z ⊂ $G_z$ and in addition 1) |ψ| ≤ 1, 2) $\psi \to e^{-i\theta}\psi$, then f = ψφ
is analytic and single-valued for z ⊂ $G_z$ and |f| ≤ |φ|. Thus if we wish
to calculate l.u.b. $|f(z_0)|$ for $z_0$ ⊂ $G_z$ we may consider the equivalent
problem of calculating l.u.b. $|\psi(z_0)|$ for $z_0$ ⊂ $G_z$ and for the class of func-
tions {ψ(z)} as defined above. It is clear from the definition of ψ that
l.u.b. $|\psi(z_0)|$ < 1 (for $\theta \not\equiv 0$ (mod 2π)), which we shall denote by μ, is attained
by some function ψ*(z) which belongs to the family {ψ(z)}. This is an
immediate consequence of the theory of normal families.¹⁴ However we can
go further. We shall show that μ is the limit of a monotonic non-increasing
sequence which will be defined below and shall exhibit the totality of extremal
functions which correspond to the bound μ.

To this end we shall introduce Poincaré's uniformisation function.¹⁵
Let $G_z^\infty$ denote the universal covering surface of $G_z$ and let z = z(x), |x| < 1
denote the mapping function which maps the interior of the unit circle |x| = 1
one to one and conformally onto $G_z^\infty$ such that $z(0) = z_0$ and z′(0) > 0.
It is well-known that z(x) is automorphic under a cyclic group of trans-
formations $\{T^k\}$ where T is a hyperbolic transformation which corresponds
to the passing from a given point ζ on a given sheet of $G_z^\infty$ to that point of
$G_z^\infty$ which has the same geometric position as ζ and lies on the sheet which
follows (or precedes) the sheet containing ζ.


¹⁴ P. Montel, Leçons sur les familles normales (Paris, 1927), p. 21.
¹⁵ H. Poincaré, l. c. For the properties of the mapping function z(x) see G. Julia,
Leçons sur la représentation conforme des aires multiplement connexes (Paris, 1934),
Chap. 2 and 3.

Now consider the class of functions w(x) defined for |x| < 1 by
w(x) = ψ(z(x)). It is clear that w(x) is analytic and single-valued for
|x| < 1 and 1) |w(x)| ≤ 1 for |x| < 1, 2) $w(T) = e^{-i\theta} w(x)$, and
3) $w(0) = \psi(z_0)$. Conversely, if x(z) denotes the determination of the
inverse function of z(x), such that $x(z_0) = 0$, and if w(x) is analytic for
|x| < 1 and satisfies 1) |w| ≤ 1 and 2) $w(T) = e^{-i\theta} w(x)$, then it follows
that ψ(z) = w(x(z)) is defined and analytic for z ⊂ $G_z$, has a single valued
modulus there, and satisfies 1) |ψ| ≤ 1, 2) $\psi \to e^{-i\theta}\psi$, and 3) $|\psi(z_0)|$
= |w(0)|. Thus there is a complete one-to-one correspondence between the
two classes of functions {ψ(z)} and {w(x)} and l.u.b. $|\psi(z_0)|$ = l.u.b. |w(0)|
by virtue of 3) and the extremal functions of one class correspond to
the extremal functions of the other class under the conformal representation
of $G_z^\infty$ on |x| < 1. Therefore we shall consider in place of our original
problem for ψ the equivalent problem of determining l.u.b. |w(0)| where
w(x) is analytic for |x| < 1 and satisfies the two characteristic conditions
1) |w| ≤ 1 and 2) $w(T) = e^{-i\theta} w(x)$.

Recalling that μ = l.u.b. |w(0)|, we see that 0 < μ < 1, if $\theta \not\equiv 0$ (mod 2π).
For one can construct simple examples of functions satisfying all the require-
ments of w(x) which do not vanish for x = 0. Without loss of generality we
may assume that the extremal value μ is attained by a function w(x) for
which w(0) > 0, since the relation $w(T) = e^{-i\theta} w(x)$ is linear and homo-
geneous. Let $\mu_1$ denote the largest positive number such that there exists a
function $w_1(x)$ ⊂ E for which $w_1(0) = \mu_1$, $w_1(T0) = e^{-i\theta}\mu_1$, $w_1(T^{-1}0) = e^{i\theta}\mu_1$.
For this value of $\mu_1$, 1) of Theorem 2.1 occurs when the notation is appro-
priately modified. For if 2) of Theorem 2.1 occurred, $\mu_1$ could be replaced
by a larger number since all the inequalities in 2) of Theorem 2.1 are strong.
Thus $\mu_1$ can be calculated directly from the relations 1) of Theorem 2.1.
Similarly, let $\mu_2$ denote the largest positive number such that there exists a
function $w_2(x)$ ⊂ E for which $w_2(T^k 0) = e^{-ik\theta}\mu_2$ ($k = 0, \pm 1, \pm 2$). Here,
too, 1) of Theorem 2.1 occurs, and therefore, $\mu_2$ can be calculated algebraically
from 1) of Theorem 2.1 and $\mu_1 \ge \mu_2$. In general, let $\mu_n$ denote the largest
positive number such that there exists a function $w_n(x)$ ⊂ E for which
$w_n(T^k 0) = e^{-ik\theta}\mu_n$ ($k = 0, \pm 1, \pm 2, \ldots, \pm n$). Once again 1) of Theorem
2.1 occurs and $\mu_n$ can be calculated algebraically from the relations 1) of
Theorem 2.1. The sequence $\{\mu_n\}$ is monotonic non-increasing and converges
to a positive lower bound μ*. Since $\mu \le \mu_n$ for all n, it follows that μ ≤ μ*.


102 MAURICE H. HEINS. 


We shall prove that $\mu^* = \mu$ and therefore conclude that $\mu = \lim_{n\to\infty} \mu_n$. Let
$W_n(x)$ denote the unique function of class $\mathfrak{E}$ for which

$W_n(T^k 0) = e^{ik\theta}\mu_n$ ($k = 0, \pm 1, \pm 2, \dots, \pm n$).

It is clear that the sequence of functions $\{W_n\}$ forms a normal family
and that any limit function $w^*(x)$ of the family belongs to class $\mathfrak{E}$ and
satisfies the interpolation conditions $w^*(T^k 0) = e^{ik\theta}\mu^*$ ($k = 0, \pm 1, \pm 2, \dots$).
Let us recall that, as a consequence of Theorem 3.1, the existence of
$w^* \subset \mathfrak{E}$ which satisfies the interpolation conditions $w^*(T^k 0) = e^{ik\theta}\mu^*$
($k = 0, \pm 1, \pm 2, \dots$) implies the existence of a function $w \subset \mathfrak{E}$ which not
only satisfies the interpolation conditions of $w^*$ but also the relation
$w(Tx) = e^{i\theta} w(x)$. Therefore $\mu \ge \mu^*$, and our assertion that $\mu = \lim_{n\to\infty} \mu_n$
is established. It is now easy to determine the totality of extremal functions
for which $|w(0)| = \mu$. They are given by the formulas (3.8) and (3.10)
where the notation is suitably modified. Thus returning to the class of
functions $\{\psi\}$ which we have considered, we conclude


THEOREM 4.1. For the function $\psi(z)$ defined above, we have
l.u.b. $|\psi(z_0)| = \lim_{n\to\infty} \mu_n$. The totality of extremal functions $\psi^*(z)$ for
which $|\psi^*(z_0)| = \lim_{n\to\infty} \mu_n = \mu$ is given by the totality of functions $w(x)$
for which $|w(0)| = \mu$. ... $|f(z_0)|$ ... where equality is attained for
the functions $\psi^*$ and only such functions.


5. The principle of hyperbolic measure.$^{16}$ We shall now enunciate the
so-called “ Principle of Hyperbolic Measure” and show how the methods 
which we have developed can be applied to study the extremal problems 
associated with this principle. 

Let $G_z$ and $G_w$ be two regions which have each at least three boundary points
and let $f(z)$ be an analytic function which can be continued throughout $G_z$ such
that its functional values lie in $G_w$. Let $t(w)$ map conformally the universal
covering surface of $G_w$ on $|t| < 1$ and $x(z)$ map the universal covering
surface of $G_z$ on $|x| < 1$. We form the function $t[f(z(x))] = \phi(x)$
which can be extended throughout $|x| < 1$ and which takes on values interior
to the circle $|t| = 1$. We denote the hyperbolic lengths$^{17}$ of the four linear
elements $dx$, $dz$, $dw$, $dt$ by $d\sigma_x$, $d\sigma_z$, $d\sigma_w$, $d\sigma_t$ respectively so that in accordance
with the invariance of these lengths under the transformations $x \to z$ and


16 Cf. note 4.
17 R. Nevanlinna, Eindeutige analytische Funktionen (Berlin, 1936), Chap. 1.




$w \to t$, we have $d\sigma_x = d\sigma_z$ and $d\sigma_w = d\sigma_t$. We conclude from Pick's Theorem$^{18}$
that $d\sigma_t \le d\sigma_x$ and therefore that $d\sigma_w \le d\sigma_z$. This is the Principle of
Hyperbolic Measure.
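
As a numerical illustration only, the inequality $d\sigma_t \le d\sigma_x$ can be checked for one sample analytic map of the unit circle into itself. The sketch below assumes the hyperbolic density $1/(1 - |x|^2)$ and uses a hypothetical test map, $x$ times a Blaschke factor with zero `A`; neither the map nor the names come from the paper.

```python
import random

def pick_ratio(phi, dphi, x):
    """Ratio d(sigma_t)/d(sigma_x) at x for t = phi(x), with density 1/(1 - |x|^2)."""
    return abs(dphi(x)) * (1 - abs(x) ** 2) / (1 - abs(phi(x)) ** 2)

A = 0.3 + 0.4j  # hypothetical zero of the test map, |A| = 0.5

def phi(x):
    # x times a Blaschke factor: analytic with |phi| < 1 in |x| < 1 and phi(0) = 0
    return x * (x - A) / (1 - A.conjugate() * x)

def dphi(x):
    # derivative of phi by the product rule
    blaschke = (x - A) / (1 - A.conjugate() * x)
    d_blaschke = (1 - abs(A) ** 2) / (1 - A.conjugate() * x) ** 2
    return blaschke + x * d_blaschke

def sample_ratios(count=2000, radius=0.95, seed=0):
    """Ratios at pseudo-random points of the disc |x| < radius."""
    rng = random.Random(seed)
    ratios = []
    while len(ratios) < count:
        x = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
        if abs(x) < radius:
            ratios.append(pick_ratio(phi, dphi, x))
    return ratios

ratios = sample_ratios()
ratio_at_origin = pick_ratio(phi, dphi, 0j)  # equals |phi'(0)| = |A|
```

At the origin the ratio reduces to $|\phi'(0)|$, the quantity whose least upper bound is studied in this section.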

If $G_z$ is not simply-connected, and if we consider functions $f(z)$ which
are single-valued for $z \subset G_z$, then, in general, the inequality $d\sigma_w \le d\sigma_z$ is to
be replaced by the strong inequality $d\sigma_w < d\sigma_z$ and l.u.b. $d\sigma_{w_0}/d\sigma_{z_0} < 1$.

Let us suppose henceforth that $G_z$ is a doubly-connected region the
boundary of which consists of two disjoint continua. We shall consider
functions which are analytic (save for possible poles) and single-valued for
$z \subset G_z$ such that $w = f(z) \subset G_w$ where $G_w$ is any region the boundary of
which contains at least three points and $f(z_0) = w_0$ where $z_0$ is a given point
of $G_z$ and $w_0$ is a given point of $G_w$. Our problem is to determine effectively
l.u.b. $d\sigma_{w_0}/d\sigma_{z_0}$.

As in the general statement of the Principle of Hyperbolic Measure, let
$z(x)$ map $|x| < 1$ onto $G_z^{\infty}$ one to one and conformally such that $z(0) = z_0$
and $z'(0) > 0$, and let $w(t)$ map $|t| < 1$ onto $G_w$ if $G_w$ is simply-connected,
or if $G_w$ is multiply-connected, onto $G_w^{\infty}$ one to one and conformally such
that $w(0) = w_0$ and $w'(0) > 0$. Then the function $\phi(x) = t[f(z(x))]$,
$|x| < 1$, where that determination of $t(w)$ is chosen for which $t(w_0) = 0$,
as it has been defined above, has the properties $\phi(0) = 0$ and $|\phi| \le |x| < 1$
for $|x| < 1$ by Schwarz's Lemma. Furthermore we know that $z(x)$ is automorphic
under a cyclic group of hyperbolic transformations $\{T^k\}$ generated
from the hyperbolic transformation $T$. If $G_w$ is simply-connected, the inverse
function of $w(t)$ is single-valued; on the other hand, if $G_w$ is multiply-connected,
$w(t)$ is automorphic under a denumerable group of transformations
$G_w[U_1, U_2, \dots]$ which are either hyperbolic or parabolic, and therefore $t(w)$,
any determination of the inverse of $w(t)$, is a linear polymorphic function
which has the law of transformation

$t(w) \to U_k[t(w)]$

when we continue $t(w)$ along a path in $G_w^{\infty}$ from a given point on $G_w^{\infty}$ to
any point which has the same geometric position as the given point. With this
fact in mind, let us study the possible functional relations which $\phi(x)$ may
satisfy when $x$ is replaced by $Tx$. It is evident that $z(Tx) = z(x)$; therefore
$f(z(Tx))$ and $f(z(x))$ have the same geometric position on $G_w^{\infty}$ and therefore

$t[f(z(Tx))] = U_k[t[f(z(x))]]$
18 G. Pick, "Ueber eine Eigenschaft der konformen Abbildung kreisförmiger Bereiche,"
Mathematische Annalen, vol. 77 (1916), pp. 1-6.




where $U_k$ is some substitution of the group $G_w$. Hence $\phi(x)$ satisfies a functional
relation of the form

$\phi(Tx) = U_k[\phi(x)]$

where $U_k \subset G_w$. But not all substitutions $U_k \subset G_w$ are candidates. For if $U_k$
is to be a candidate, we must have $\phi(x) = U_k^{-1}[\phi(Tx)]$ and we know from
Schwarz's Lemma that $|\phi| \le |x|$ for $|x| < 1$; therefore $|U_k^{-1}[\phi(Tx)]| \le |x|$
for $|x| < 1$. Setting $x = T^{-1}0$ we find

$|U_k^{-1}(0)| \le |T^{-1}0| < 1$.

Now there are only a finite number of substitutions $U_k$ of the group $G_w$ for
which this is true.$^{19}$ Furthermore it is conceivable that there need not exist
a function $\phi$ analytic for $|x| < 1$ which vanishes at $x = 0$ and satisfies the
functional relation $\phi(Tx) = U_k[\phi(x)]$. (Let us remark that apart from the
identical transformation the $U_k$ are all hyperbolic or parabolic and therefore
have fixed points on the unit circle.) By means of Theorem 3.1 we may
eliminate those substitutions $U_k$ which must be excluded, a fortiori, from our
discussion. Let $U_1, \dots, U_m$ denote those substitutions of $G_w$, finite in
number, such that $\phi$, as it has been constructed, satisfies one (and only one)
of the relations

$\phi(Tx) = U_k[\phi(x)]$ ($k = 1, \dots, m$).


Conversely, let $\phi \subset \mathfrak{E}$ and furthermore satisfy 1) $\phi(0) = 0$, 2) $\phi(Tx)
= U_k[\phi(x)]$ where $U_k$ is one of the allowed substitutions. Then the function
$f(z) = w[\phi(x(z))]$, where the determination of $x(z)$ is so made that $x(z_0) = 0$,
is defined and analytic throughout $G_z$. Furthermore $f(z)$ is single-valued.
For, as $x(z)$ is continued along a path of $G_z^{\infty}$ from a given point of $G_z^{\infty}$ to
a point which has the same geometric position as the given point, $x(z)$ is
transformed into $T^n x(z)$ and $\phi(x(z))$ is transformed into $U_k^n[\phi(x(z))]$.
But $w(t)$ is automorphic under the group of substitutions $G_w$, and therefore

$w[U_k^n(\phi(x(z)))] = w[\phi(x(z))]$.

Therefore $f(z)$ is single-valued. It is evident that $w = f(z) \subset G_w$ and
$f(z_0) = w_0$. Thus the study of l.u.b. $d\sigma_{w_0}/d\sigma_{z_0}$ and the associated extremal
functions is equivalent to the study of l.u.b. $|d\sigma_t/d\sigma_x|_{x=0}$ and the associated
extremal functions where $t = \phi(x)$, $|x| < 1$, satisfies 1) $\phi(0) = 0$ and
2) $\phi(Tx) = U_k[\phi(x)]$ where $U_k$ is one of the substitutions $U_1, \dots, U_m$


19 This is a consequence of the nature of the transformation $U_k$. See G. Julia, l. c.
in note 15.




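since $d\sigma_t = d\sigma_w$ and $d\sigma_x = d\sigma_z$. But let us note that since $\phi(0) = 0$,
$d\sigma_x = |dx|$ and $d\sigma_t = |dt|$ and therefore

l.u.b. $|d\sigma_t/d\sigma_x|_{x=0}$ = l.u.b. $|\phi'(0)|$.

From this it follows that l.u.b. $d\sigma_{w_0}/d\sigma_{z_0} = \max_k \mu^{(k)}$ where $\mu^{(k)}$ = l.u.b. $|\phi'(0)|$
and $\phi$ satisfies the following conditions 1) $\phi \subset \mathfrak{E}$, 2) $\phi(0) = 0$, 3) $\phi(Tx)
= U_k[\phi(x)]$.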
Let us determine $\mu^{(k)}$. We have shown in Section 3 that, if $\phi$ satisfies
the conditions 1), 2), 3), it may be expressed in the form

(5.1) $\phi(x) = \dfrac{P(x) - Q(x)\,\phi_k(x)}{R(x) - S(x)\,\phi_k(x)}$

where $\phi_k \subset \mathfrak{E}$ and satisfies a functional relation of the form

(5.2) $\phi_k(Tx) = U^*_k[\phi_k(x)]$ ($U^*_k$ linear)

and where $P$, $Q$, $R$, $S$ have the significance attributed to them in Section 3
with an appropriate modification of the notation. It is established in the
Pick-Nevanlinna theory$^{20}$ that $S(0) = 0$, and since $\phi(0) = 0$ we have
$P(0) = Q(0) = 0$. Therefore $\phi'(0) = P'(0) - Q'(0)\phi_k(0)$ where $\phi_k \subset \mathfrak{E}$
satisfies the relation (5.2). We have

(5.3) l.u.b. $|\phi'(0)|$ = l.u.b. $|P'(0) - Q'(0)\phi_k(0)|$.

We are now in a position to apply the methods which we have employed to
discuss the extremal problem associated with the Principle of the Harmonic
Majorant. Let $\mu_0^{(k)} = \max |P'(0) - Q'(0)\phi_k^{(0)}(0)|$ where $\phi_k^{(0)}(x) \subset \mathfrak{E}$;
$\mu_0^{(k)}$ can be determined directly. Let $\mu_1^{(k)} = \max |P'(0) - Q'(0)\phi_k^{(1)}(0)|$
such that there exists a function $\phi_k^{(1)}(x) \subset \mathfrak{E}$ for which

$\phi_k^{(1)}(T^{j}0) = U_k^{*j}[\phi_k^{(1)}(0)]$ ($j = 0, \pm 1$).

By 1) of Theorem 2.1 we are assured of the existence of $\phi_k^{(1)}(x)$ since there
always exists a function $\phi_k \subset \mathfrak{E}$ satisfying the relation (5.2). It is clear
that $\mu_0^{(k)} \ge \mu_1^{(k)}$. In general, let $\mu_n^{(k)} = \max |P'(0) - Q'(0)\phi_k^{(n)}(0)|$ such
that there exists a function $\phi_k^{(n)} \subset \mathfrak{E}$ for which

$\phi_k^{(n)}(T^{j}0) = U_k^{*j}[\phi_k^{(n)}(0)]$ ($j = 0, \pm 1, \dots, \pm n$).

It is clear that $\{\mu_n^{(k)}\}$ is a monotonic non-increasing sequence, such that
$\mu_n^{(k)} \ge \mu^{(k)}$. As a consequence of exactly the same reasoning that we have


20 J. L. Walsh, l. c., p. 304.




employed in studying the extremal questions associated with the Principle of
the Harmonic Majorant, we conclude that $\lim_{n\to\infty} \mu_n^{(k)} = \mu^{(k)}$.

Let $\mu = \max_k \mu^{(k)}$. It is now simple to determine the associated extremal
functions. Let $\mu = |P'(0) - Q'(0)\nu|$ with the restriction that $|\nu| < 1$.
Now, for those values of $\nu$ so restricted that there exists a function $\phi_k \subset \mathfrak{E}$
for which

$\phi_k(T^{j}0) = U_k^{*j}[\nu]$ ($j = 0, \pm 1, \pm 2, \dots$; $\mu^{(k)} = \max_k \mu^{(k)}$),

and only those values there correspond in accordance with Theorem 3.1 the
associated extremal functions and Theorem 3.1 gives us the totality of such
functions. By conformal transformation of the independent and dependent
variables, we find the corresponding extremal functions in our original problem
which is thus solved.

Let us remark that this result gives the exact value of the "Starrheitskonstante"
$\Omega$ of Aumann and Carathéodory$^{21}$ as well as the associated
extremal functions for doubly-connected regions.


6. The analogue of the Pick-Nevanlinna interpolation problem for 
doubly-connected regions. Theorems 3.1 and 3.2 virtually contain the 
solution to the following interpolation problem: 

Let Gz be a doubly-connected region in the z-plane, the boundary of 
which consists of two disjoint continua. What is a necessary and sufficient 
condition that there exist a function $w(z)$ analytic for $z \subset G_z$ which satisfies
the following requirements: 1) $|w(z)| \le 1$ for $z \subset G_z$, 2) $w(z)$ is single-valued
for $z \subset G_z$, 3) $w(z_k) = w_k$ ($k = 1, 2, \dots, n$ or $k = 1, 2, \dots$) where
$w_k$ are assigned complex numbers the moduli of which are not greater than
unity and the $z_k$ are distinct given points of $G_z$?

If such a w exists, when is it unique? 

If w is not unique, what is the totality of functions which satisfy the 
requirements 1), 2), 3)? 

It is clear from our discussion that this problem is equivalent to the 
problem of Theorem 3.1 where U is the identical transformation and T is 
hyperbolic. Theorems 3.1 and 3.2 furnish the solution of this equivalent 
problem and therefore of the problem which we have just posed. 


HARVARD UNIVERSITY. 


21 G. Aumann and C. Carathéodory, l. c.




RAMANUJAN SUMS AND ALMOST PERIODIC FUNCTIONS.* 


By M. KAC,$^{1}$ E. R. VAN KAMPEN and AUREL WINTNER.


Introduction. Several classical formal trigonometrical expansions of the
analytic theory of numbers have recently been shown$^{2}$ to be periodic or almost
periodic Fourier series of the functions which they represent. The object of
the present paper is to prove a corresponding result for a class of multiplicative
arithmetical sequences.

In particular, it will be shown that, for the functions to be considered,
the celebrated formal trigonometric sums of Ramanujan$^{3}$ are almost periodic
Fourier expansions in the sense of Besicovitch. Hence, the Ramanujan coeffi-
cients will turn out to be Fourier averages which vanish for incommensurable 
values of the frequency parameter, the almost periodic function in question 
being always limit periodic (grenzperiodisch). It should be emphasized that 
the fact that the Ramanujan trigonometrical expansions turn out to be Fourier 
expansions leads without any further device to his explicit formulae, if one 
writes down the Fourier average representations of the coefficients. 

Although the arithmetical functions $f(n)$ will be considered only for
$n = 1, 2, \dots$, one can realize the usual assumption of the Besicovitch theory
by placing $f(-n) = f(n)$ for $n = 1, 2, \dots$ and $f(0) = 0$ (the multiplicative
character of $f$ then remains preserved). It is understood that the class
(B) of functions $f(n)$ which are defined for integers may be introduced either
directly or by considering the step function $f(t)$ which has the value $f(n)$ for
$n \le t < n + 1$.


1. By a multiplicative function $f$ is meant a sequence $f(n)$; $n = 1, 2, 3, \dots$
for which $f(n_1 n_2) = f(n_1) f(n_2)$ whenever $(n_1, n_2) = 1$ and $f(n) \ne 0$ for at
least one $n$ (so that $f(1) = 1$). Only those multiplicative $f(n)$ will be considered
for which


* Received March 31, 1939. 

1 Fellow of the Parnas Foundation, Lwów, Poland.

2 A. Wintner, American Journal of Mathematics, vol. 57 (1935), pp. 534-538; Duke
Mathematical Journal, vol. 2 (1936), pp. 443-446; American Journal of Mathematics,
vol. 59 (1937), pp. 629-634; P. Hartman and A. Wintner, Travaux de l'Institut Math.
de Tbilissi, vol. 3 (1938), pp. 113-119; P. Hartman, American Journal of Mathematics,
vol. 60 (1938), pp. 66-74; A. Wintner, Revista de Ciencias (Lima, 1939) (in press).

3 S. Ramanujan, Collected Papers, Cambridge University Press (1927), pp. 179-199.




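(1) $f(n) = \prod_{p \mid n} f(p)$,

where the $p$ denote prime numbers. An $f(n)$ which satisfies (1) will be called
strongly multiplicative. A classical instance of (1) is

(2) $f(n) = \dfrac{\phi(n)}{n} = \prod_{p \mid n} \left(1 - \dfrac{1}{p}\right) = \prod_{p \mid n} f(p)$; ($\phi$ = Euler's function).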


For any $f(n)$ and for any positive integer $k$, put

(3) $f^{(k)}(n) = 1$ or $f^{(k)}(n) = f(p_k)$ according as $n \not\equiv 0$ or $n \equiv 0$ (mod $p_k$),

where $p_k$ is the $k$-th prime; and put

(4) $f_k(n) = \prod_{j=1}^{k} f^{(j)}(n)$; so that $f_k(n) = \prod_{p \mid n} f(p)$, where $p \le p_k$.
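
The definitions of a strongly multiplicative $f(n)$ and of its truncation $f_k(n)$ can be sketched as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, trial division is used for factoring, and the example values $f(p) = 1 - 1/p$ are those of $\phi(n)/n$.

```python
from fractions import Fraction

def prime_divisors(n):
    """Distinct prime divisors of n >= 1, by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes

def strongly_multiplicative(n, f_at_prime, largest_prime=None):
    """Product of f(p) over primes p | n; keeping only p <= largest_prime gives f_k."""
    value = Fraction(1)
    for p in prime_divisors(n):
        if largest_prime is None or p <= largest_prime:
            value *= f_at_prime(p)
    return value

def euler_ratio_at_prime(p):
    return Fraction(p - 1, p)  # f(p) = 1 - 1/p, i.e. f(n) = phi(n)/n

f_12 = strongly_multiplicative(12, euler_ratio_at_prime)         # primes 2 and 3
f2_of_35 = strongly_multiplicative(35, euler_ratio_at_prime, 3)  # no prime <= 3 divides 35
# f_2 uses the primes 2 and 3 only, so it should repeat with period 2 * 3 = 6
f2_values = [strongly_multiplicative(n, euler_ratio_at_prime, 3) for n in range(1, 13)]
```

Exact rational arithmetic is used so that the periodicity of $f_k$ can be compared without rounding.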


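According to (3), the function $f^{(k)}(n)$ of $n$ has the period $p_k$ and possesses
the Fourier expansion

(5) $f^{(k)}(n) = 1 + \dfrac{f(p_k) - 1}{p_k} \sum_{m=0}^{p_k - 1} \cos\left(2\pi \dfrac{m}{p_k} n\right)$,

which is, in fact, nothing but the formula of equidistant trigonometrical
interpolation. According to (4), the function $f_k(n)$ of $n$ has the period
$P_k = p_1 p_2 \cdots p_{k-1} p_k$ and possesses, in view of (4) and (5), the Fourier
expansion

(6) $f_k(n) = c_k + c_k \sum_{q \mid P_k,\ q > 1}\ \prod_{p \mid q} \dfrac{f(p) - 1}{f(p) - 1 + p} \sum_{1 \le m < q,\ (m, q) = 1} \cos\left(2\pi \dfrac{m}{q} n\right)$,

where $c_k = \prod_{j=1}^{k} \left(1 + \dfrac{f(p_j) - 1}{p_j}\right)$.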


2. For a function $g = g(n)$ defined for $n = 1, 2, 3, \dots$, put

(7) $M\{g\} = M\{g(n)\} = \lim_{n \to \infty} \dfrac{1}{n} \sum_{m=1}^{n} g(m)$,

if this limit exists.
All considerations will be based on the following elementary lemma:

If a strongly multiplicative function $f(n)$ satisfies the condition

(8) $\sum_p \dfrac{|f(p) - 1|}{p} < \infty$,

then the mean value $M\{f\}$ exists and

(9) $M\{f\} = \prod_p \left(1 + \dfrac{f(p) - 1}{p}\right)$.




In order to prove this, let $g(n)$ denote the multiplicative function which
is defined as

$g(n) = 0$ or $g(n) = \prod_{p \mid n} \dfrac{f(p) - 1}{p}$

according as $n$ is not or is quadratfrei. Then, for every positive integer $m$,

$f(m) = \sum_{d \mid m} d\, g(d)$;

hence, $\sum_{m=1}^{n} f(m) = \sum_{m=1}^{n} \left[\dfrac{n}{m}\right] m\, g(m)$, and so

$\sum_{m=1}^{n} f(m) = n \sum_{m=1}^{n} g(m) + O\left(\sum_{m=1}^{n} m\, |g(m)|\right)$.

Since the definition of $g(m)$ and the assumption (8) obviously imply the
(absolute) convergence of the series $\sum_{m=1}^{\infty} g(m)$ to the sum represented by the
product on the right of (9), it follows that in order to prove (9), it is
sufficient to show that

$O\left(\sum_{m=1}^{n} m\, |g(m)|\right) = o(n)$.

But the last relation is clear from the absolute convergence of the series
$\sum_{m=1}^{\infty} g(m)$; so that the proof is complete.
m=1 
The proof which we had originally for the above lemma was function-theoretical
in nature. The above elementary approach was then suggested to
us by Dr. Paul Erdös.
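
As a rough numerical check of the lemma, one may take $f(n) = \phi(n)/n$, for which $1 + (f(p) - 1)/p = 1 - p^{-2}$, so that the product over all primes is $6/\pi^2$. The sketch below compares an empirical average with a partial product; the cut-offs 20000 and 2000 are arbitrary choices made here, not values from the paper.

```python
import math

def totients(limit):
    """Euler phi(m) for 0 <= m <= limit, by a sieve."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:  # p is prime: no smaller prime has touched it
            for multiple in range(p, limit + 1, p):
                phi[multiple] -= phi[multiple] // p
    return phi

def mean_value(limit):
    """(1/n) * sum of phi(m)/m for m <= n, with n = limit."""
    phi = totients(limit)
    return sum(phi[m] / m for m in range(1, limit + 1)) / limit

def partial_product(prime_bound):
    """Product of 1 + (f(p) - 1)/p over primes p <= prime_bound, with f(p) = 1 - 1/p."""
    product = 1.0
    for p in range(2, prime_bound + 1):
        if all(p % d for d in range(2, int(p ** 0.5) + 1)):
            product *= 1 + ((1 - 1 / p) - 1) / p
    return product

empirical = mean_value(20000)
product = partial_product(2000)
limit_value = 6 / math.pi ** 2
```

Both quantities should lie close to `limit_value`; the empirical mean converges slowly, so only a few digits of agreement are to be expected.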


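3. A corollary of (8)-(9) is that for a strongly multiplicative $f(n)$
one has

(10) $\lim_{n \to \infty} \dfrac{1}{\log n} \sum_{m=1}^{n} \dfrac{1}{m f(m)} = \prod_p \left(1 + \dfrac{1 - f(p)}{p f(p)}\right)$, if $\sum_p \dfrac{|f(p)^{-1} - 1|}{p} < \infty$.

In fact, on writing $1/f(p)$ for $f(p)$ in (8)-(9), one obtains (10), since
if $\dfrac{1}{n} \sum_{m=1}^{n} a_m \to a$, then also $\dfrac{1}{\log n} \sum_{m=1}^{n} \dfrac{a_m}{m} \to a$.

Similarly, if $f(n)^{\lambda}$ denotes the $\lambda$-th power of $f(n)$ when either $f(n) > 0$ or
$\lambda$ is an integer, then

(11) $\lim_{n \to \infty} \dfrac{1}{n^{1+\lambda}} \sum_{m=1}^{n} m^{\lambda} f(m)^{\lambda} = \dfrac{1}{1 + \lambda} \prod_p \left(1 + \dfrac{f(p)^{\lambda} - 1}{p}\right)$,

if $\sum_p \dfrac{|f(p)^{\lambda} - 1|}{p} < \infty$ and $\lambda > -1$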




(and (10) may be thought of as the limiting case $\lambda = -1$). In order to
prove (11), it is sufficient to replace $f(m)$ in (8)-(9) by the strongly
multiplicative function $f(m)^{\lambda}$ and then apply the Abelian lemma:

if $\sum_{m=1}^{n} a_m \sim a n$, then $(1 + \lambda) \sum_{m=1}^{n} m^{\lambda} a_m \sim a n^{1+\lambda}$ for every $\lambda > -1$.
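
The Abelian lemma (if $\sum_{m \le n} a_m \sim an$, then $(1+\lambda)\sum_{m \le n} m^{\lambda} a_m \sim a n^{1+\lambda}$ for $\lambda > -1$) can be tried on a simple test sequence. The sequence below, equal to 2 on even integers and 0 on odd ones (so that $a = 1$), is a hypothetical choice made only for this illustration.

```python
def abelian_ratio(a, lam, n):
    """(1 + lam) * sum_{m <= n} m**lam * a(m) / n**(1 + lam)."""
    total = sum(m ** lam * a(m) for m in range(1, n + 1))
    return (1 + lam) * total / n ** (1 + lam)

def a(m):
    # test sequence with mean value 1: its partial sums are asymptotic to n
    return 2.0 if m % 2 == 0 else 0.0

ratio_half = abelian_ratio(a, 0.5, 200000)         # lambda = 1/2
ratio_minus_half = abelian_ratio(a, -0.5, 200000)  # lambda = -1/2, slower convergence
```

Both ratios should be near $a = 1$; the approach to the limit is visibly slower as $\lambda$ decreases toward $-1$.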


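As an illustration, consider the example (2); so that $f(p) = 1 - p^{-1}$.
In this case, (10) is applicable and goes over into Landau's relation

$\lim_{n \to \infty} \dfrac{1}{\log n} \sum_{m=1}^{n} \dfrac{1}{\phi(m)} = \prod_p \dfrac{1 - p^{-6}}{(1 - p^{-2})(1 - p^{-3})}$,

while (9) is applicable to any power of $\phi(n)/n$ and gives Schur's relation

$M\left\{\left(\dfrac{\phi(n)}{n}\right)^{\lambda}\right\} = \prod_p \left(1 + \dfrac{(1 - p^{-1})^{\lambda} - 1}{p}\right)$

for every real $\lambda$ (and, as seen from the proof of (9), for every complex $\lambda$ also).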


4. For every strongly multiplicative, positive $f(n)$, let $f^{+}(n)$, $f^{-}(n)$
denote the strongly multiplicative, positive functions which at an arbitrary
$n = p$ attain the values

$f^{+}(p) = \mathrm{Max}\,(1, f(p))$ and $f^{-}(p) = \mathrm{Min}\,(1, f(p))$,

respectively. Then (1) shows that

($12_1$) $f(n) = f^{+}(n) f^{-}(n)$; ($12_2$) $0 < f^{-}(n) \le 1 \le f^{+}(n)$;

while (4) clearly implies that

($13_1$) $f^{+}(n) \ge f_k^{+}(n)$, $f^{-}(n) \le f_k^{-}(n)$;

($13_2$) $f - f_k = (f^{-} - f_k^{-}) f^{+} + (f^{+} - f_k^{+}) f_k^{-}$.

Notice that either of the functions $f_k^{\pm}$ is uniquely determined by $f$ and $k$, i.e.
that $(f^{\pm})_k = (f_k)^{\pm}$.


Using these notations, it will be easy to deduce from (9) the following 
theorem : 




Every strongly multiplicative, positive function $f(n)$ which satisfies (8)
is almost periodic (B); furthermore,

(14) $M\{|f - f_k|\} \to 0$, as $k \to \infty$.

In fact, it is clear from (7) and (6) that $M\{f_k\} = c_k$. Since $c_k$ in (6)
was defined as the $k$-th partial product of the infinite product (9), it follows
that

(14 bis) $c_k = M\{f_k\} \to M\{f\}$, as $k \to \infty$.

Hence, (14) is certainly true if either $f(n) \ge f_k(n)$ or $f(n) \le f_k(n)$ for
every $n$ and $k$. It follows therefore from ($13_1$) that

(15) $M\{|f^{+} - f_k^{+}|\} \to 0$ and $M\{|f^{-} - f_k^{-}|\} \to 0$, as $k \to \infty$.

But the function (6) of $n$ is periodic for every $f$, hence also for $f^{\pm}$; so that
either of the functions $f_k^{\pm}$ of $n$ is periodic for every $k$. It follows therefore
from (15) that either of the functions $f^{\pm}(n)$ is almost periodic (B). Since
($12_2$) shows that $f^{-}(n)$ is a bounded function, it follows from ($12_1$) that
$f(n)$ is almost periodic (B).

In order to prove (14), notice first that, by ($13_1$) and ($13_2$),

(15 bis) $M\{|f - f_k|\} \le M\{(f_k^{-} - f^{-}) f^{+}\} + M\{(f^{+} - f_k^{+}) f_k^{-}\}$.

The sum $M + M$ on the right of (15 bis) may readily be written in the form
$2M\{f_k^{-} f^{+}\} - M\{f\} - M\{f_k\}$. It follows therefore from (14 bis) and (15 bis)
that in order to prove (14), it is sufficient to show that $M\{f_k^{-} f^{+}\} \to M\{f\}$
as $k \to \infty$. But this is obvious from (9) and from the definitions of $f_k^{-}$
and $f^{+}$.


5. The almost periodicity (B) of $f(n)$, proved in §4, implies that the
$n$-average $M\{f(n) \exp 2\pi i \lambda n\}$ exists for every real $\lambda$. It turns out that this
Fourier coefficient vanishes for every irrational $\lambda$; so that $f(n)$ is limit periodic
(grenzperiodisch); more explicitly, the Fourier series (B) of $f(n)$ is

(16) $f(n) \sim M\{f\} + M\{f\} \sum_{q} \prod_{p \mid q} \dfrac{f(p) - 1}{f(p) - 1 + p} \sum_{m} \cos\left(2\pi \dfrac{m}{q} n\right)$,

where the first (exterior) summation is over all quadratfrei $q > 1$, while, if
$q$ is fixed, the index $p$ runs through all prime divisors $p$ of $q$, and $m$ through
the $\phi(q)$ values which satisfy $(m, q) = 1$ and $1 \le m < q$.




In fact, (16) follows from (14), (14 bis) and (6), since $P_k$ in (6) was
defined as the product of the first $k$ primes.

The restriction of the first summation index of (16) to quadratfrei $q > 1$
may be eliminated in the usual manner, if one introduces the Möbius function
$\mu(r)$, where $r = 1, 2, 3, \dots$. In fact, (16) may then clearly be written in
the form

(17) $f(n) \sim M\{f\} \sum_{r=1}^{\infty} \mu(r)\, c_r(n) \prod_{p \mid r} \dfrac{1 - f(p)}{f(p) - 1 + p}$,

if $c_r(n)$ is an abbreviation for the finite sum

(18) $c_r(n) = \sum_{m} \cos\left(2\pi \dfrac{m}{r} n\right)$, where $(m, r) = 1$ and $1 \le m \le r$.

Since the $\phi(r)$ angles which occur in the sum (18) are symmetrically placed,
the sum which one obtains by writing sin for cos is 0; so that

(18 bis) $c_r(n) = \sum_{m} \exp\left(2\pi i \dfrac{m}{r} n\right)$, where $(m, r) = 1$ and $1 \le m \le r$.

Thus, the $c_r(n)$ are precisely the Ramanujan sums,$^{4}$ and so the Fourier series
(B) of $f(n)$ is identical with Ramanujan's formal trigonometric series for
$f(n)$. The coefficients of the series

(19) $f(n) \sim \sum_{r=1}^{\infty} a_r c_r(n)$

are

(20) $a_r = a_r(f) = M\{f\}\, \mu(r) \prod_{p \mid r} \dfrac{1 - f(p)}{f(p) - 1 + p}$ ($r = 1, 2, 3, \dots$),

by (17); while the expansion functions (18) of (19) may be expressed$^{5}$ in
terms of the Euler $\phi$-function and the Möbius $\mu$-function as follows:

(21) $c_r(n) = \dfrac{\phi(r)\, \mu(r/t)}{\phi(r/t)}$, where $t = (n, r)$.
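
The agreement between the cosine sum defining $c_r(n)$ and Hölder's closed form $c_r(n) = \phi(r)\mu(r/t)/\phi(r/t)$, $t = (n, r)$, can be checked directly for small $r$ and $n$. The following sketch uses naive helper functions (hypothetical names, no attempt at efficiency).

```python
import math

def euler_phi(r):
    """Euler's function, by direct counting."""
    return sum(1 for m in range(1, r + 1) if math.gcd(m, r) == 1)

def mobius(r):
    """Moebius function, by trial division."""
    result, p = 1, 2
    while p * p <= r:
        if r % p == 0:
            r //= p
            if r % p == 0:
                return 0  # r had a square factor
            result = -result
        p += 1
    return -result if r > 1 else result

def ramanujan_sum_cos(r, n):
    """c_r(n) as the cosine sum over 1 <= m <= r with (m, r) = 1."""
    return sum(math.cos(2 * math.pi * m * n / r)
               for m in range(1, r + 1) if math.gcd(m, r) == 1)

def ramanujan_sum_hoelder(r, n):
    """c_r(n) = phi(r) * mu(r/t) / phi(r/t) with t = gcd(n, r)."""
    q = r // math.gcd(n, r)
    return euler_phi(r) * mobius(q) // euler_phi(q)  # phi(q) divides phi(r) since q | r

max_gap = max(abs(ramanujan_sum_cos(r, n) - ramanujan_sum_hoelder(r, n))
              for r in range(1, 41) for n in range(1, 41))
```

The closed form is integer-valued, while the cosine sum agrees with it only up to floating-point rounding, which is what `max_gap` measures.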


6. According to (16), the frequencies (Fourier exponents) of the almost 
periodic function f(n) are rational numbers between 0 and 1 (or, rather, 
between —1 and 1). Let the terms of the Fourier series (16) be ordered 
in the Ramanujan fashion (17)-(18), and suppose that each of them actually 


4 S. Ramanujan, loc. cit. 3, pp. 180-181.
5 O. Hölder, Lichtenstein Memorial Volume, Prace Matematyczno-Fizyczne, vol. 43
(1936), pp. 13-23.




occurs, i.e., that none of the coefficients (20) of (19) vanishes. Then the
frequencies of $f(n)$ are uniformly distributed on the interval $[0, 1]$ (or rather,
$[-1, 1]$). This may be proved as follows:

Since $|\mu(m)| \le 1$, while $\phi(m) \to \infty$ as $m \to \infty$, Hölder's formula (21)
implies an observation of Ramanujan, according to which $c_r(n) = O(1)$
when either $r$ is fixed and $n \to \infty$, or $n$ is fixed and $r \to \infty$. In particular,

(21 bis) $\lim_{r \to \infty} \dfrac{c_r(n)}{\phi(r)} = 0$ for every fixed $n \ge 1$.

Now, (21 bis) is equivalent to the equidistribution of the frequencies of (19).
In fact, let $S^{(r)}$ denote, for any fixed $r \ge 1$, the finite sequence

(22) $S^{(r)}$: $\dfrac{m_1^{(r)}}{r}, \dfrac{m_2^{(r)}}{r}, \dots, \dfrac{m_{\phi(r)}^{(r)}}{r}$

of those fractions $m/r$ whose numerator $m$ satisfies the conditions $(m, r) = 1$
and $1 \le m \le r$. And let $\rho_r(x)$, $0 \le x \le 1$, denote the distribution function
of the $\phi(r)$ fractions contained in $S^{(r)}$. Then it is clear from (18 bis) that
the ratio occurring on the left of (21 bis) is the $n$-th Fourier-Stieltjes
coefficient of $\rho_r(x)$, i.e., that

(22 bis) $\int_0^1 \exp(2\pi i n x)\, d\rho_r(x) = \dfrac{c_r(n)}{\phi(r)}$.

Thus, it is clear from the criterion of Weyl for equidistribution (mod 1), that
the content of (21 bis) may be expressed as follows: The ordered infinite
sequence of fractions which is obtained by writing $r = 1, 2, \dots$ in (22) is
uniformly distributed on the interval $[0, 1]$. This fact, which is equivalent
to a result of Pólya,$^{6}$ may be obtained without the Fourier analysis (22 bis)
of the sequence (22) also, and contains the corresponding fact concerning the
ordered infinite sequence of Farey sections.$^{7}$
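
Weyl's criterion can be tried numerically on an initial segment of the sequence of reduced fractions $m/r$, $(m, r) = 1$, $1 \le m \le r$, $r = 1, 2, \dots$. The sketch below computes the averaged exponential sums for a few frequencies $n$; the cut-off $r \le 200$ is an arbitrary choice made for this illustration.

```python
import cmath
import math

def reduced_fractions(max_denominator):
    """The fractions m/r with (m, r) = 1 and 1 <= m <= r, listed for r = 1, 2, ..."""
    return [m / r
            for r in range(1, max_denominator + 1)
            for m in range(1, r + 1) if math.gcd(m, r) == 1]

def weyl_average(points, n):
    """(1/N) * sum of exp(2*pi*i*n*x) over the N points x."""
    return sum(cmath.exp(2j * math.pi * n * x) for x in points) / len(points)

points = reduced_fractions(200)
averages = [abs(weyl_average(points, n)) for n in range(1, 6)]
```

For an equidistributed sequence these averages tend to 0 for every fixed $n \ge 1$; with this cut-off they are already small compared with 1.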


7. The considerations of §4 and §5 may be modified in such a way as
to lead to ($B^2$) instead of to (B). To this end, one merely has to replace
the condition (8) by the pair of conditions


p 


<i: 


6 Cf. G. Pólya and G. Szegö, Aufgaben und Lehrsätze aus der Analysis, chap. II,
no. 188.
7 Cf. loc. cit. 6, chap. II, no. 189.



In fact, a strongly multiplicative (real) function which satisfies (23) is almost
periodic ($B^2$) and has the Fourier expansion (16) or (19); furthermore,

(24) $M\{(f - f_k)^2\} \to 0$ as $k \to \infty$

and the Parseval relation takes the form

(25) $M\{f^2\} = \sum_{r=1}^{\infty} \phi(r)\, a_r^2$.

In fact, if (23) is satisfied, then (4) shows that (9) is applicable to any
of the three functions $f(n)^2$, $f_k(n)^2$, $f(n) f_k(n)$. Thus, the three averages
$M\{f^2\}$, $M\{f_k^2\}$, $M\{f f_k\}$ exist and have the respective values

$\prod_p \left(1 + \dfrac{f(p)^2 - 1}{p}\right)$, $\prod_{p \le p_k} \left(1 + \dfrac{f(p)^2 - 1}{p}\right)$, $\prod_{p \le p_k} \left(1 + \dfrac{f(p)^2 - 1}{p}\right) \prod_{p > p_k} \left(1 + \dfrac{f(p) - 1}{p}\right)$.

Hence, $M\{f(n)^2\} + M\{f_k(n)^2\} - 2M\{f(n) f_k(n)\} \to 0$ as $k \to \infty$. This
proves (24). Since $f_k(n)$ is, by §1, a periodic function of $n$, it follows from
(24) that $f(n)$ is almost periodic ($B^2$). Finally, (25) is clear from (17),
since (19) and (18) show that every amplitude (20) occurs in (17) exactly
$\phi(r)$ times.

As an illustration, consider the example (2). Then $f(p) = 1 - p^{-1}$;
so that (23) is satisfied, and (20) shows that the coefficients (19) are

(26) $a_r = M\{f\}\, \mu(r) \prod_{p \mid r} (p^2 - 1)^{-1}$ ($f(n) = \phi(n)/n$).
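
For the example $f(n) = \phi(n)/n$ one has $M\{f\} = 6/\pi^2$, and the coefficients $M\{f\}\mu(r)\prod_{p \mid r}(p^2 - 1)^{-1}$ give Ramanujan's series for $\phi(n)/n$. The sketch below sums that series up to an arbitrary cut-off and compares it with $\phi(n)/n$; it evaluates $c_r(n)$ by Hölder's formula, and all names are hypothetical.

```python
import math

def tables(limit):
    """Sieve for Euler phi, Moebius mu, and the weights prod_{p | r} 1/(p^2 - 1)."""
    phi = list(range(limit + 1))
    mu = [1] * (limit + 1)
    weight = [1.0] * (limit + 1)
    for p in range(2, limit + 1):
        if phi[p] == p:  # p is prime
            for multiple in range(p, limit + 1, p):
                phi[multiple] -= phi[multiple] // p
                mu[multiple] = -mu[multiple]
                weight[multiple] /= p * p - 1
            for multiple in range(p * p, limit + 1, p * p):
                mu[multiple] = 0  # not quadratfrei
    return phi, mu, weight

def ramanujan_partial_sum(n, limit, phi, mu, weight):
    """(6/pi^2) * sum_{r <= limit} mu(r) c_r(n) / prod_{p | r} (p^2 - 1)."""
    total = 0.0
    for r in range(1, limit + 1):
        if mu[r] == 0:
            continue
        q = r // math.gcd(n, r)
        c_r = phi[r] * mu[q] // phi[q]  # Hoelder's formula for c_r(n)
        total += mu[r] * c_r * weight[r]
    return 6 / math.pi ** 2 * total

LIMIT = 3000
phi, mu, weight = tables(LIMIT)
errors = [abs(ramanujan_partial_sum(n, LIMIT, phi, mu, weight) - phi[n] / n)
          for n in range(1, 13)]
```

For this particular $f$ the series converges absolutely, so the partial sums can be compared pointwise with $\phi(n)/n$; in general the expansion is only asserted in the sense of Besicovitch.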
pir 


THE JOHNS HOPKINS UNIVERSITY. 




AN ASYMPTOTIC FORMULA FOR EXPONENTIAL INTEGRALS.* + 


By PHILIP HARTMAN.


It is known$^{1}$ that if $f(x)$ is a function possessing a continuous second
derivative in the interval $0 \le x \le 1$, then

(1) $\int_0^1 f(x) \exp(i t x^{\alpha})\, dx = \exp(\pi i/2\alpha)\, \Gamma(1 + \alpha^{-1})\, f(0)\, t^{-1/\alpha} + O(t^{-2/\alpha})$,

as $t \to +\infty$ for all $\alpha \ge 2$. Professor Wintner pointed out to me the problem
suggested by the relation (1), namely, to determine conditions for the asymptotic
formula (1) which are less restrictive than the assumption that $f(x)$
have a continuous second derivative, and to replace, at the same time, (1) by

(2) $\int_0^1 g(x) \exp(-s x^{\alpha})\, dx \sim \Gamma(1 + \alpha^{-1})\, g(+0)\, s^{-1/\alpha}$,

where $s = \sigma + it$ is a complex variable. The object of this paper is to provide
the answer to these questions.

It is clear that a necessary condition for (2) is that $\sigma$, the real part of $s$,
should be non-negative. The case $\sigma \ge 0$ requires more stringent conditions
than the case $\sigma > 0$. For this reason, the main results are stated in two
theorems.


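THEOREM 1. If
(i) $g(x)$ is of bounded variation in $0 \le x \le 1$, and
(ii) $\alpha > 1$,
then

(3) $\int_0^1 g(x) \exp(-s x^{\alpha})\, dx = \Gamma(1 + \alpha^{-1})\, g(+0)\, s^{-1/\alpha} + o(|s|^{-1/\alpha})$,

uniformly as $|s| \to \infty$ in the half-plane $|\arg s| \le \pi/2$.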
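
The leading term $\Gamma(1 + \alpha^{-1})\, g(+0)\, s^{-1/\alpha}$ of the integral $\int_0^1 g(x)\exp(-s x^{\alpha})\,dx$ can be compared with a direct numerical quadrature. The sketch below assumes the test function $g(x) = \cos x$, the exponent $\alpha = 2$, and two sample values of $s$ with $|s| = 400$ (one real, one with $\arg s = \pi/4$); these choices and the Simpson rule are illustrative only.

```python
import cmath
import math

def simpson(func, a, b, intervals):
    """Composite Simpson rule; `intervals` must be even."""
    h = (b - a) / intervals
    total = func(a) + func(b)
    for j in range(1, intervals):
        total += (4 if j % 2 else 2) * func(a + j * h)
    return total * h / 3

def exponential_integral(g, s, alpha, intervals=20000):
    """Numerical value of the integral of g(x) * exp(-s * x**alpha) over [0, 1]."""
    return simpson(lambda x: g(x) * cmath.exp(-s * x ** alpha), 0.0, 1.0, intervals)

def leading_term(g0, s, alpha):
    """Gamma(1 + 1/alpha) * g(+0) * s**(-1/alpha), principal branch of the power."""
    return math.gamma(1 + 1 / alpha) * g0 * s ** (-1 / alpha)

alpha = 2.0
ratios = []
for s in (400.0, 400.0 * cmath.exp(0.25j * math.pi)):
    ratios.append(exponential_integral(math.cos, s, alpha) / leading_term(1.0, s, alpha))
```

Each entry of `ratios` should be close to 1, and closer still as $|s|$ is increased.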


* Received December 14, 1938. 

+ Presented to the Society, February 25, 1939. 

1 The case $\alpha = 2$ was treated by O. Perron, "Über das infinitäre Verhalten der
Koeffizienten einer gewissen Potenzreihe," Archiv der Mathematik und Physik, Series
III, vol. 22 (1914), pp. 329-340. The formula (1) was proved for $\alpha \ge 2$ by A. Wintner,
"On the asymptotic formulae of Riemann and of Laplace," Proceedings of the National
Academy of Sciences, vol. 20 (1934), pp. 57-62.



THEOREM 2. If
(i) the integral$^{2}$ of $g(x)$ over $0 \le x \le 1$ exists,
(ii) $g(+0) = \lim_{x \to +0} g(x)$ exists, and
(iii) $\alpha > 1$,
then (3) holds uniformly as $|s| \to \infty$ in the angular region $|\arg s| \le \pi/2 - \epsilon$
where $\epsilon > 0$ is arbitrary. If condition (i) holds, but (ii), (iii) are replaced by
(ii') $|g(x) - g(+0)|\, |\log x|^{1/\alpha} \to 0$, $x \to 0$, and
(iii') $\alpha > 0$,
then (3) holds uniformly as $|s| \to \infty$ in the angular region $|\arg s| \le \pi/2 - \epsilon$.

It has been recognized$^{3}$ that in the Laplace case, i.e. $s$ real, $\Gamma(1 + \alpha^{-1})
g(+0)\, s^{-1/\alpha}$ is only the first term of an extended asymptotic formula for the
integral in (3) if $g(x)$ possesses a sufficient number of derivatives. Actually,
the same is true if $s$ is not real. However, in the Riemann case, i.e. $s$ purely
imaginary, the remainder term cannot be better than $O(t^{-1})$. Furthermore,
the condition that $g(x)$ possess a number of derivatives may be replaced by a
much weaker condition. In this direction, we have the following corollaries:

COROLLARY 1. Let $n \ge 1$ be an arbitrary integer; $\beta_k$, $c_k$, $k = 1, \dots, n$,
arbitrary constants such that

(i) $0 \le \beta_1 < \beta_2 < \cdots < \beta_n$,
(ii) $f(x) = \sum_{k=1}^{n} c_k x^{\beta_k} + x^{\beta_n} h(x)$ where $h(x)$ is of bounded variation in
$0 \le x \le 1$, $h(+0) = 0$, and
(iii) $\alpha > 1 + \beta_n$,
then

(4) $\int_0^1 f(x) \exp(-s x^{\alpha})\, dx = \alpha^{-1} \sum_{k=1}^{n} c_k\, \Gamma\left(\dfrac{1 + \beta_k}{\alpha}\right) s^{-(1+\beta_k)/\alpha} + o(|s|^{-(1+\beta_n)/\alpha})$

holds uniformly as $|s| \to \infty$ in the half-plane $|\arg s| \le \pi/2$.


COROLLARY 2. Let $n \ge 1$ be an arbitrary integer; $\beta_k$, $c_k$, $k = 1, \dots, n$,
arbitrary constants such that


2 In this paper, an integral over a finite interval is to be considered as an ordinary
Lebesgue integral. An integral over an infinite interval is to be interpreted as an
improper Riemann integral.

3 Cf. O. Perron, "Über die näherungsweise Berechnung von Funktionen grosser
Zahlen," Münchner Sitzungsberichte (1917), pp. 191-220; A. Haar, "Über asymptotische
Entwicklungen von Funktionen," Mathematische Annalen, vol. 96 (1926), pp.
69-107; A. Wintner, "Untersuchungen über Funktionen grosser Zahlen," Mathematische
Zeitschrift, vol. 28 (1928), pp. 416-429; A. Wintner, loc. cit. 1.




(i) $0 \le \beta_1 < \beta_2 < \cdots < \beta_n$,
(ii) $f(x) = \sum_{k=1}^{n} c_k x^{\beta_k} + x^{\beta_n} h(x)$ where the integral of $h(x)$ over
$0 \le x \le 1$ exists,
(iii) $h(x)\, |\log x|^{(1+\beta_n)/\alpha} \to 0$, $x \to 0$, and
(iv) $\alpha > 0$,
then (4) holds uniformly as $|s| \to \infty$ in the angular region $|\arg s| \le \pi/2 - \epsilon$.

These corollaries reduce to the corresponding theorem if $n = 1$, $\beta_1 = 0$,
$c_1 = f(+0)$. It will be clear from the proof that a corollary analogous to the
first part of Theorem 2 is true, i.e. if one replaces condition (iii), (iv) of
Corollary 2 by

(iii') $h(x) \to 0$, $x \to 0$, and
(iv') $\alpha > 1 + \beta_n$.

It may be noted that Corollary 1 for $n = 2$, $\beta_1 = 0$, $\beta_2 = 1$ is a slight
improvement over (1) without any assumption as to the differentiability of
$f(x)$. Also, if one does assume that $f(x)$ possesses $m$ ($> 0$) continuous derivatives,
an application of Corollary 2 gives an asymptotic formula with $(m + 1)$
terms, while earlier results$^{4}$ in the Laplace case give a formula with only $m$
terms.

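First, the corollaries will be proved. By changing the integration variable
from $x$ to $x^{1/(1+\beta)}$, $\beta \ge 0$,

(5) $\int_0^1 x^{\beta} \exp(-s x^{\alpha})\, dx = (1 + \beta)^{-1} \int_0^1 \exp[-s x^{\alpha/(1+\beta)}]\, dx$.

Thus, one must consider integrals of the type

$\int_0^1 \exp(-s x^{\gamma})\, dx$, $\gamma = \alpha/(1 + \beta)$.

Now,

(6) $\int_0^1 \exp(-s x^{\gamma})\, dx = \int_0^{\infty} \exp(-s x^{\gamma})\, dx - \int_1^{\infty} \exp(-s x^{\gamma})\, dx$,

where the two integrals on the right of (6) exist if either

(7) $|\arg s| \le \pi/2$ and $\gamma > 1$
or
(8) $|\arg s| \le \pi/2 - \epsilon$ and $\gamma > 0$.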

4 Cf. A. Wintner, loc. cit. 1.




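In either case$^{5}$

(9) $\int_0^{\infty} \exp(-s x^{\gamma})\, dx = \Gamma(1 + \gamma^{-1})\, s^{-1/\gamma} = \gamma^{-1}\, \Gamma(\gamma^{-1})\, s^{-1/\gamma}$.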

(10) | |= 4y72 | 


The appraisal (10) is obtained by changing the integration variable from z to 
z/Y and applying the second mean value theorem to the resulting integral 


(over a finite interval) 
b 
exp(— sx) dx 
b 
exp(—sr)de exp(— sx) dz. 
1 


(It is understood that the second mean value theorem is applied separately 
to the real and imaginary parts of the integral and that, in the above formula 
and in the sequel, the following notation is used 


whenever € is a limit of integration.) By integrating, it is seen that the 
absolute value of each of the integrals on the right is less than 2s |~'. The 
inequality (10) now follows by letting b+ o. 


On the other hand,® in case (8) 


oO 
(12) | exp(— dx | Cy |s|-¥, 

1 
where N > 0 is arbitrary and Cy depends only on N and y. To prove (12), 
note that 


oo 
| exp (— sav) dx |S exp(— o27) dz, 
1 


where s=o-+ il. By changing the integration variable from x to o'/x, the 


last integral becomes 


(13) f exp(— 27) da = f exp (— exp(— 27 +- de, 


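5 Put $s = r \exp(i\theta)$; then

$\int_0^{\infty} \exp(-s x^{\gamma})\, dx = r^{-1/\gamma} \int_0^{\infty} \exp[-x^{\gamma} \exp(i\theta)]\, dx$.

It can be shown by a straightforward application of Cauchy's integral theorem that in
both cases (7) and (8)

$\int_0^{\infty} \exp[-x^{\gamma} \exp(i\theta)]\, dx = \exp(-i\theta/\gamma) \int_0^{\infty} \exp(-x^{\gamma})\, dx$,

while the last integral is $\Gamma(1 + \gamma^{-1})$.
6 This appraisal is given by A. Wintner, loc. cit. 1.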


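where the limits of integration are $\sigma^{1/\gamma}$ and $+\infty$. It follows from (13), that

$\left|\int_1^{\infty} \exp(-s x^{\gamma})\, dx\right| < \sigma^{-1/\gamma} \exp(-\sigma^{1/2}) \int_0^{\infty} \exp(-x^{\gamma} + x^{\gamma/2})\, dx$,

from which one obtains (12).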
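Now, (5), (6), (9), (10), (12) imply

(14) $\int_0^1 \sum_{k=1}^{n} c_k x^{\beta_k} \exp(-s x^{\alpha})\, dx = \alpha^{-1} \sum_{k=1}^{n} c_k\, \Gamma[(1 + \beta_k)/\alpha]\, s^{-(1+\beta_k)/\alpha} + O(|s|^{-1})$

uniformly as $|s| \to \infty$ if either
$|\arg s| \le \pi/2$ and $\alpha > 1 + \beta_n$
or
$|\arg s| \le \pi/2 - \epsilon$ and $\alpha > 0$.
Thus, to complete the proof of the corollaries, it must be shown that

$\int_0^1 h(x)\, x^{\beta} \exp(-s x^{\alpha})\, dx = o(|s|^{-(1+\beta)/\alpha})$.

By changing the integration variable from $x$ to $x^{1/(1+\beta)}$, this becomes

$\int_0^1 h(x^{1/(1+\beta)}) \exp(-s x^{\alpha/(1+\beta)})\, dx = o(|s|^{-(1+\beta)/\alpha})$.

On placing
$g(x) = h(x^{1/(1+\beta)})$,
it is seen that the Theorems 1 and 2 must be proved in the case $g(+0) = 0$.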
Define the non-increasing function $m(|s|) = m(s)$ as follows
(15) m(s) =1l.u.b.| g(z)| for O0< |", 
so that 


$m(s) \to 0$, $|s| \to \infty$.


Let φ(s) = φ(|s|) be a non-decreasing function of |s| which approaches ∞
with |s| so slowly that


(16) m[| s 9, |S|—> 
For example, one may let 
min|[ | 8 m (| : 
Under the last set of conditions of Theorem 2, it may be supposed that⁷ in
addition to (16)
‘It may be supposed that 126 > 0, otherwise the second part of Theorem 2 is a 
special case of the first part. In this case, let 
| | 
The function @(s) may be defined to be 
min[y (| s |1/26-7) -1/2 log1/é | s|, 8 |1/26 log1/6 | s 


for a small constant 7 > 0. 




PHILIP HARTMAN. 


(17) $(s)*/log | s | |s|—> oo. 
Now, 


1 1 
(18) f, g(x) exp(— sx*) dr = g 4 exp(— sx) dz. 
0 0 


Consider the last integral from 0 to b, 0 < b=1, 


b b 
| exp(— sx) dz | = m(b-*) f. 2-5) 
0 
= m(b1/*) 
Thus, if one places 
b= | 
it is seen from (16), that 


b 
(20) 20-9? exp(— sx) de = 0(| |”), 


uniformly as $|s| \to \infty$ in the half-plane $|\arg s| \le \pi/2$. In order to appraise
the integral on the right of (18) from b to 1, apply the second mean value 
theorem (to the monotone function #~°)/*), one obtains 


1 
(21) exp(— sx) dx +- exp(— sx) dz, 
b £ 
where, cf. (11), $\xi = \xi(s) = (\xi_1, \xi_2)$ satisfies
(22) $b < \xi_1 < 1$, $b < \xi_2 < 1$.


The treatment of the integrals in (21) is essentially different for Theorem 
1 and Theorem 2. Under the conditions of the first theorem, g(x'/°) is of 
bounded variation and is, therefore, the difference of two non-decreasing func- 
tions. Thus, it may be supposed without loss of generality that g(x”) is a 
bounded monotone function, so that the second mean value theorem may be 
applied to each of the integrals in (21). It follows that 


1 
(23) if g exp(— sx) dz | S Mb“) 8 | s + 8M |s 
b 


where M =1.u.b.| g(x)| for 0S «2=1; so that by (19), and the fact that 
$(s) > « and (1—8) < 0, the integral in (23) is 0(| s |-’”) uniformly as 
{s|—> oo in the half-plane | args|<-/2. This completes the proof of 
Theorem 1 and Corollary 1. 

Put s = r exp(iθ); in the angular region $|\arg s| = |\theta| \le \pi/2 - \varepsilon$, one
has $\cos\theta \ge c > 0$, for some constant $c = c_\varepsilon$. Then for any $0 < p < q \le 1$


(24) | f | = "| exp(—era)de. 




Denote by S the set of points z in 0 = 41 such that | g(a’ | > 1; and by 
Tq the set of points in the interval, p= xq which are not in S. Thus, 
if z is in then | g(a'/*)| = 1; if is in S, then > for some constant 
0 since g(x) > 0, Therefore, by the first mean value theorem, 


(25) | exp(— crx) dr exp(— crx) dx < (cr)! exp(—crp) ; 
also 
(26) | g(x) | exp(— crx) dx = J exp(— crn), 
where 
J -{ | g(a) | da. 
Combining (24), (25), (26), ‘ 
(27) | | < J exp(— erm) + exp(—crp). 


Thus, (19), (21), (22), (27) imply 


(28) | exp(— sx) dx | < 2J[1 + 1° 9-95 exp(— crn) 
exp[— + exp[— op (s)*]. 
The first term on the right of (28) is clearly 0(77/) = o0(|s |“). Since 
exp[— c(s)*] > 0, |s|—> o, 

the second term is =o0(|s while the last term is o(17') 
=o0(|s if 8 >1. Thus, the first part of Theorem 2 follows from (18), 
(20) and (28). 

Under the conditions of the last part of Theorem 2, the last term on the 
right of (28) is also o(7r-/°) even for 128>0. For 

 exp[— cb(s)*] = exp{ (— log r) [e(s)®/log r— (1 —8)/8]}, 


and the factor of 7~'/° is 0(1) as r— o in virtue of (17). This completes 
the proof of Theorem 2 and Corollary 2. 


QUEENS COLLEGE, 
FLUSHING, NEW YorK. 




ALMOST PERIODICITY AND THE REPRESENTATION OF 
INTEGERS AS SUMS OF SQUARES.* 


By M. Kac.** 


Let $r_s(n)$ be the number of different representations of n as a sum of
s squares. Then

(1) $\left(1 + 2\sum_{n=1}^{\infty} q^{n^2}\right)^{s} = 1 + \sum_{n=1}^{\infty} r_s(n)\, q^{n}$.
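The generating-function identity above gives a direct way to tabulate the representation numbers. The following is a minimal sketch, assuming the identity has the standard form (the s-th power of the theta series 1 + 2q + 2q⁴ + ...); the function name `r_s_table` is hypothetical and introduced only for illustration.

```python
def r_s_table(s, limit):
    """Return [r_s(0), ..., r_s(limit)]: coefficients of (1 + 2*sum q^(m^2))^s."""
    # theta[n] = coefficient of q^n in 1 + 2*sum_{m>=1} q^(m^2)
    theta = [0] * (limit + 1)
    theta[0] = 1
    m = 1
    while m * m <= limit:
        theta[m * m] = 2
        m += 1
    # Multiply the truncated series by theta, s times.
    coeffs = [1] + [0] * limit
    for _ in range(s):
        product = [0] * (limit + 1)
        for i, a in enumerate(coeffs):
            if a == 0:
                continue
            for j in range(limit + 1 - i):
                if theta[j]:
                    product[i + j] += a * theta[j]
        coeffs = product
    return coeffs
```

Representations are counted with signs and order, so that, for instance, 1 has four representations as a sum of two squares.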


Hardy has given an analysis of the arithmetical properties of $r_s(n)$ based on
the theory of elliptic ϑ-functions. The object of this note is to point out the
close connection between the investigations of Hardy¹ and the theory of almost
periodic sequences. In particular the “singular series” of Hardy and Littlewood
turns out to be a formal Fourier expansion of $n^{1-s/2} r_s(n)$. The main
result of Hardy that for $5 \le s \le 8$ the sum of the “singular series” is precisely
$n^{1-s/2} r_s(n)$ will be shown to be equivalent to the statement that $n^{1-s/2} r_s(n)$,
where $5 \le s \le 8$, is a uniformly almost periodic sequence. The case s = 2
is of particular interest, since $r_2(n)$ is² not even an almost periodic function
of class (B), although the Fourier coefficients exist and tend to 0. The
investigations on almost periodicity of functions occurring in the analytic number
theory and given by formal trigonometrical series, as originated by Wintner,³
have led, thus far, to functions which are almost periodic of the class (B²),
at least. This may emphasize the interest of the situation mentioned above.


* Received May 11, 1939. 

** Fellow of the Parnas Foundation, Lwów, Poland.

¹ G. H. Hardy, “On the representation of a number as the sum of any number of
squares, and in particular of five,” Transactions of the American Mathematical Society,
vol. 21 (1920), pp. 255-284. Cf. also S. Ramanujan, “On certain arithmetical functions,”
Collected papers of Srinivasa Ramanujan, Cambridge (1927), pp. 136-162.

² A. S. Besicovitch, Almost Periodic Functions, Cambridge, 1932. In particular
pp. 91-109.

³ A. Wintner, “On the asymptotic distribution of the remainder term of the prime-
number theorem,” American Journal of Mathematics, vol. 57 (1935), pp. 534-548; “The
asymptotic behavior of the function 1/ζ(1 + it),” Duke Mathematical Journal, vol. 2
(1936), pp. 443-446. Cf. also M. Kac, E. R. van Kampen and A. Wintner, “Ramanujan’s
sums and almost periodic functions,” American Journal of Mathematics, this number,
pp. 107-114.

1. If f(n) is a function defined for n = 0, 1, 2, ... and λ a real number,
the averages $M\{f(n)\exp(2\pi i\lambda n)\}$, where

$M\{g(n)\} = \lim_{n\to\infty} \dfrac{1}{n}\sum_{j=0}^{n-1} g(j),$

are called the Fourier coefficients of f(n) if these averages exist. If these
averages exist and if they vanish except for an at most enumerable set of
λ-values, say for $\lambda = \lambda_1, \lambda_2, \ldots$, the series

$\sum_{k} a_k \exp(-2\pi i\lambda_k n)$, where $a_k = M\{f(n)\exp(2\pi i\lambda_k n)\}$,

will be called the Fourier series of f(n), also when f(n) is not almost
periodic (B).
It will be shown that the Fourier coefficients of $f(n) = n^{1-s/2} r_s(n)$ exist
and that the Fourier series of $n^{1-s/2} r_s(n)$ is the “singular series” of Hardy
and Littlewood, namely


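(2) $\rho_s(n) = \dfrac{\pi^{s/2}}{\Gamma(s/2)} \sum_{k=1}^{\infty} \sum_{(h,k)=1} \left(\dfrac{S_{h,k}}{k}\right)^{s} \exp(-2\pi i h n/k),$

where $S_{h,k} = \sum_{j=1}^{k} \exp(2\pi i h j^2/k)$.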
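The Gaussian sums and the partial sums of the "singular series" can be evaluated numerically. The sketch below assumes that the inner sum runs over the residues 0 ≤ h < k prime to k (with h = 0 admitted only for k = 1) and that the normalising factor is π^{s/2}/Γ(s/2); the function names are hypothetical.

```python
import cmath
from math import gcd, gamma, pi

def gauss_sum(h, k):
    """S_{h,k} = sum_{j=1}^{k} exp(2*pi*i*h*j^2/k)."""
    return sum(cmath.exp(2j * pi * h * (j * j % k) / k) for j in range(1, k + 1))

def singular_series_partial(s, n, K):
    """Partial sum over k <= K of the series for rho_s(n)."""
    total = 0j
    for k in range(1, K + 1):
        for h in range(k):
            if gcd(h, k) == 1:  # gcd(0, 1) == 1, so k = 1 contributes h = 0
                term = (gauss_sum(h, k) / k) ** s
                total += term * cmath.exp(-2j * pi * h * (n % k) / k)
    return (pi ** (s / 2) / gamma(s / 2)) * total
```

For s = 8 and n = 1 the partial sums should settle near 16, the number of representations of 1 as a sum of eight squares, in agreement with Hardy's result for 5 ≤ s ≤ 8.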


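Suppose first that λ is irrational. Then

$\lim_{n\to\infty} \dfrac{1}{n}\sum_{j=0}^{n-1}\exp(2\pi i\lambda j^2) = 0,$

and so

$(1-q)^{1/2}\left(1 + 2\sum_{j=1}^{\infty} q^{j^2}\exp(2\pi i\lambda j^2)\right) \to 0$, as $q \to 1-0$.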


Thus, (1) implies that

$(1-q)^{s/2}\left(1 + \sum_{j=1}^{\infty} r_s(j)\, q^{j}\exp(2\pi i\lambda j)\right) \to 0$, as $q \to 1-0$.
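The vanishing of the quadratic exponential means for irrational λ, on which the argument above rests, can be observed numerically. The following is a rough sketch only: the choice λ = √2 and the function name `weyl_mean` are assumptions made for illustration, and a finite computation proves nothing about the limit.

```python
import cmath
from math import pi, sqrt

def weyl_mean(lam, n):
    """(1/n) * sum_{j=0}^{n-1} exp(2*pi*i*lam*j^2)."""
    total = 0j
    for j in range(n):
        # Reduce lam*j^2 modulo 1 before exponentiating to limit rounding error.
        total += cmath.exp(2j * pi * ((lam * j * j) % 1.0))
    return total / n

irrational_case = abs(weyl_mean(sqrt(2), 100000))  # expected to be small
rational_case = abs(weyl_mean(1 / 3, 3000))        # equals |S_{1,3}| / 3
```

For rational λ = h/k the mean does not tend to 0 in general but to the normalised Gaussian sum, which is why the rational case is treated separately in the text.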


Making use of a well known Tauberian theorem,⁴ one obtains


⁴ Cf. for instance J. Karamata, “Neuer Beweis und Verallgemeinerung einiger
Tauberian-Sätze,” Mathematische Zeitschrift, vol. 33 (1931), pp. 294-299. The theorem
cannot be applied directly since the coefficients are not positive. One can make use of
the fact that $n^{-s/2}\sum_{j=1}^{n} r_s(j) \to \pi^{s/2}/\Gamma(\tfrac{1}{2}s + 1)$ and then apply the theorem to the series
$\sum r_s(n)(1 + \cos 2\pi\lambda n)q^n$ and $\sum r_s(n)(1 + \sin 2\pi\lambda n)q^n$. See also J. Karamata, “Sur
la moyenne arithmétique des coefficients d’une série de Taylor,” Mathematica (Cluj),
vol. 1 (1929), pp. 99-106.


$n^{-s/2} \sum_{j=0}^{n-1} r_s(j)\exp(2\pi i\lambda j) \to 0$, as $n \to \infty$.

It follows now immediately that, for irrational λ,
$M\{n^{1-s/2} r_s(n)\exp(2\pi i\lambda n)\} = 0.$
Next, suppose that λ = h/k. Observing, as Hardy does, that


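$1 + 2\sum_{j=1}^{\infty} q^{j^2}\exp(2\pi i h j^2/k) \sim \dfrac{S_{h,k}}{k}\,\pi^{1/2}(1-q)^{-1/2}$, as $q \to 1-0$,

one obtains

$1 + \sum_{j=1}^{\infty} r_s(j)\, q^{j}\exp(2\pi i h j/k) \sim \left(\dfrac{S_{h,k}}{k}\right)^{s} \pi^{s/2}(1-q)^{-s/2}$, as $q \to 1-0$.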


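Applying again the Tauberian theorem,⁴ one arrives at

$n^{-s/2} \sum_{j=0}^{n-1} r_s(j)\exp(2\pi i h j/k) \to \dfrac{\pi^{s/2}}{\Gamma(\tfrac{1}{2}s + 1)}\left(\dfrac{S_{h,k}}{k}\right)^{s}$, as $n \to \infty$.

This obviously implies that
$M\{n^{1-s/2} r_s(n)\exp(2\pi i h n/k)\} = \dfrac{\pi^{s/2}}{\Gamma(\tfrac{1}{2}s)}\left(\dfrac{S_{h,k}}{k}\right)^{s}$
and the proof of the italicized statement is complete.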


2. From the classical results concerning the Gaussian sums $S_{h,k}$ one
readily deduces $|S_{h,k}| \le 2k^{1/2}$ and it is clear that for $s \ge 5$ the “singular
series” (2) is absolutely convergent. Since it was already proved that the
“singular series” is the Fourier series of $n^{1-s/2} r_s(n)$, it follows that

$n^{1-s/2} r_s(n) = \rho_s(n)$

holds if and only if $n^{1-s/2} r_s(n)$ is uniformly almost periodic. The function
$n^{1-s/2} r_s(n)$ is not uniformly almost periodic for s > 8 and it is almost periodic
for $5 \le s \le 8$. This is only a restatement of Hardy’s results; it seems to be
very difficult to prove or disprove elementarily the uniform almost periodicity
of $n^{1-s/2} r_s(n)$.


3. In the case s = 3 or s = 4 the “singular series” is not any more
absolutely convergent and $n^{1-s/2} r_s(n)$ is not uniformly almost periodic. Nevertheless
it still has some properties of almost periodicity. In fact, it is almost




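periodic (B²). For simplicity, only the case s = 4 will be discussed. Let
$\omega_j(n)$ be 1 or 0 according as j is or is not a divisor of n, and let ψ(n) be
defined by

$2^{\psi(n)} \mid n, \quad 2^{\psi(n)+1} \nmid n.$

Jacobi’s well known theorem concerning the representation of $r_4(n)$ in terms
of σ(n) may then be written as follows:

$\dfrac{r_4(n)}{n} = 8\,\dfrac{1 + 2\omega_2(n)}{2^{\psi(n)+1}-1}\,\dfrac{\sigma(n)}{n} = 8\,\dfrac{1 + 2\omega_2(n)}{2^{\psi(n)+1}-1}\sum_{j=1}^{\infty}\dfrac{\omega_j(n)}{j}.$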
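Jacobi's formula in the form just stated can be checked against a direct count for small n. This is a minimal sketch, assuming that σ(n) is the sum of the divisors of n and that ψ(n) denotes the exponent of the highest power of 2 dividing n (the printed definition is partly illegible); all function names are hypothetical.

```python
from math import isqrt

def sigma(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)

def psi(n):
    """Exponent of the highest power of 2 dividing n (assumed meaning of psi)."""
    e = 0
    while n % 2 == 0:
        n //= 2
        e += 1
    return e

def r4_jacobi(n):
    """r_4(n) = 8 * (1 + 2*omega_2(n)) * sigma(n) / (2^(psi(n)+1) - 1)."""
    omega2 = 1 if n % 2 == 0 else 0
    return 8 * (1 + 2 * omega2) * sigma(n) // (2 ** (psi(n) + 1) - 1)

def r4_brute(n):
    """Count integer quadruples (a, b, c, d) with a^2 + b^2 + c^2 + d^2 = n."""
    m = isqrt(n)
    count = 0
    for a in range(-m, m + 1):
        for b in range(-m, m + 1):
            for c in range(-m, m + 1):
                rest = n - a * a - b * b - c * c
                if rest < 0:
                    continue
                d = isqrt(rest)
                if d * d == rest:
                    count += 1 if d == 0 else 2  # d and -d are distinct unless d = 0
    return count
```

The integer division in `r4_jacobi` is exact because σ(n) is divisible by 2^{ψ(n)+1} − 1, the divisor sum of the 2-part of n.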


It can be easily verified that $M\{(n^{-1}\sigma(n) - \sum_{j=1}^{N} j^{-1}\omega_j(n))^2\} \to 0$, as $N \to \infty$.
This proves the almost periodicity (B²) of $n^{-1}\sigma(n)$, since the finite sums
$\sum_{j=1}^{N} j^{-1}\omega_j(n)$ are periodic. Observing that the ratio of $1 + 2\omega_2(n)$ and $2^{\psi(n)+1} - 1$
is also almost periodic (B²), one sees that so is $n^{-1} r_4(n)$.
From Jacobi’s theorem one deduces by an elementary computation that 
M{nr?(n) } = 420€(3). On the other hand the Parseval relation gives 


M{n?r2(n)} = 3 | |S where | Sax = 420¢(3). 

k=1 (hyk)=1 (hyn)=1 

4. The Parseval relation is evidently valid in the case $5 \le s \le 8$, and
it seems to be quite probable that it holds also for s > 8. This would mean
that if s > 8, then $n^{1-s/2} r_s(n)$ is, though not uniformly almost periodic, at
least almost periodic (B²). Furthermore it would follow immediately that
the inequality $|n^{1-s/2} r_s(n) - \rho_s(n)| < \varepsilon$ holds for “almost all” integers (i. e.
except for a sequence of integers of density 0) whatever is ε > 0.


5. The case s = 2 is the most exceptional, which is due to the fact that 
r(n) is 0 for “ almost all” integers. r.(n) is evidently of class (B'~*), since 
= 0. 

As mentioned in the introduction, $r_2(n)$ is not even almost periodic (B).
The following simple proof of this statement was communicated to me by
Dr. E. R. van Kampen.

Let n = - -, where the p’s are primes = 3 (mod 4) 
and the q’s primes = 1 (mod 4). It is well known that 


Denote the product depending only on f’s by B(n). It is easily seen that 
B(n) is 0 for “almost all” integers and that B(n)r2(n) =7.(n). Suppose 




now that $r_2(n)$ is almost periodic (B). According to a known theorem⁵ one
could find a sequence of finite trigonometrical sums (which are in fact certain
means of the partial sums of the “singular series”) $W_k(n)$ approaching $r_2(n)$
in the mean, i.e., $M\{|r_2(n) - W_k(n)|\} \to 0$ as $k \to \infty$. Obviously

$M\{|r_2(n) - B(n) W_k(n)|\} = M\{B(n)\,|r_2(n) - W_k(n)|\}$

tends to 0 as $k \to \infty$. This implies a contradiction, since

$M\{r_2(n)\} = \pi, \quad M\{B(n)\,|W_k(n)|\} = 0,$
$|r_2(n) - B(n) W_k(n)| \ge r_2(n) - B(n)\,|W_k(n)|.$


6. It may be mentioned that the remarks of § 1 allow us to compute the 


limits of the expressions 
728 > (jk) 
j=l 


as $n \to \infty$ and k is a fixed integer. In fact, let $\omega_k(n)$ have the same meaning
as in § 3. Then $\omega_k(n) = k^{-1}\sum_{h=0}^{k-1}\exp(2\pi i h n/k)$, and so
im 3 = lim = (Sax)? 
) ()) (7) kas (48 a 1) ( hk) 
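The representation of the divisor indicator as an average of exponentials, used in the step above, is the usual orthogonality relation for roots of unity. A minimal numerical sketch follows; the function name `omega` is hypothetical.

```python
import cmath
from math import pi

def omega(k, n):
    """k^{-1} * sum_{h=0}^{k-1} exp(2*pi*i*h*n/k): 1 if k divides n, else 0."""
    return sum(cmath.exp(2j * pi * h * n / k) for h in range(k)) / k
```

Up to rounding error, the value is 1 when k divides n and 0 otherwise, which is what allows sums over multiples of k to be expressed through the Fourier coefficients computed in § 1.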


THE JOHNS HOPKINS UNIVERSITY. 


⁵ Loc. cit. 2), p. 105 (Theorem 11).




ON THE EXPANSIONS OF CERTAIN MODULAR FORMS OF 
POSITIVE DIMENSION.* 


By HERBERT S. ZUCKERMAN.¹


1. A definition of a modular form of positive dimension has been given
in a paper by Rademacher and the author.² In that paper we found the
Fourier expansions of those forms F(τ) which belong to the full modular
group and which have only polar singularities at the parabolic point τ = i∞
when measured in the uniformizing variable $x = e^{2\pi i\tau}$. In the present paper
we shall also restrict ourselves to functions which belong to the full group
and which have only polar singularities at τ = i∞, but in the definition of a
modular form we shall omit the restriction that F(τ) be analytic in the upper
half-plane and shall merely assume that F(τ) has, as singularities in the
fundamental region,³ at most a finite number of poles and possibly a polar
singularity at τ = i∞.

This problem of determining expansions of modular forms having poles
in the upper half-plane was partially considered by Hardy and Ramanujan.⁴
However they considered only forms of positive integral dimension which have
no singularities at the parabolic points. The generalization to forms of real
positive dimension presents no difficulties. We have only to introduce the
roots of unity ε(a, b, c, d) and $e^{2\pi i\alpha}$ in the transformation formulas (1.11)
and (1.12) below, and to carry them through the analysis. However in order
to take care of forms having singularities at i∞ we have to evaluate certain
integrals which Hardy and Ramanujan were able to eliminate by means of
simple estimates. This part of the work is contained in sections 3, 4, 5, and 6.

In this paper we shall consider most of the integrals in the τ-plane rather
than in the x-plane, where $x = e^{2\pi i\tau}$. The original path $\omega_N$ of section 2 is
taken in the τ-plane so that we may avoid the poles of F(τ) which are easier
to treat there than in the x-plane. After this point much of the work could


* Received May 16, 1939. 

¹ Harrison Research Fellow.

² “On the Fourier coefficients of certain modular forms of positive dimension,”
Annals of Mathematics, vol. 39 (1938), pp. 433-462, especially section 1.

³ It is convenient to choose a particular fundamental region. We take the region
|τ| ≥ 1, −1/2 ≤ ℜ(τ) ≤ 1/2, and throughout this paper we shall refer to it briefly as
the fundamental region.

⁴ “On the coefficients in the expansions of certain modular functions,” Proceedings
of the Royal Society, A, vol. 95 (1919), pp. 144-155.



be done in either plane but by keeping in the τ-plane we are able to maintain
a closer contact with the properties of modular forms and to eliminate certain 
rather artificial parts of the proof. For example, instead of using the mediants 
of the Farey series to break up the path of integration we are able to choose 
a set of points which are more closely connected with the form whose expansion 
we desire. 

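We write the transformation equations for our modular form as

(1.11) $F\!\left(\dfrac{a\tau + b}{c\tau + d}\right) = \varepsilon(a, b, c, d)\,(-i(c\tau + d))^{-r} F(\tau), \quad c > 0,$

(1.12) $F(\tau + 1) = e^{2\pi i\alpha} F(\tau), \quad 0 \le \alpha < 1,$

where $|\varepsilon(a, b, c, d)| = 1$ and where the branch of $(-i(c\tau + d))^{-r}$ is chosen
as in the original definition of a modular form.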
From (1.12) we have 


(1.2) $e^{-2\pi i\alpha(\tau+1)} F(\tau+1) = e^{-2\pi i\alpha\tau} F(\tau),$


and hence $e^{-2\pi i\alpha\tau} F(\tau)$ has a Fourier expansion for each region in which it is
analytic. Since F(τ) has only a finite number of poles in the fundamental
region, we can find a number A such that the only singularity of F(τ) with
ℑ(τ) ≥ A is at τ = i∞. To simplify the later notation we shall always take
for A a value ≥ 1. Then, noting that F(τ) was restricted to have at most
a polar singularity at τ = i∞, we see that we have a Fourier expansion


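(1.3) $e^{-2\pi i\alpha\tau} F(\tau) = \sum_{n=-\mu}^{\infty} a_n e^{2\pi i n\tau},$

which is valid for ℑ(τ) ≥ A, $|x| \le e^{-2\pi A}$.
For all τ in the upper half-plane we write

$f(x) = e^{-2\pi i\alpha\tau} F(\tau).$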


Then, within the unit circle, f(x) has a pole of order μ at x = 0, poles at
each point corresponding to the poles of F(τ), and no other singularities. In
section 2 we shall determine a closed curve $C_N$ which lies within the unit
circle, encloses the origin, and does not pass through any poles of f(x). Then
if y is a point within $C_N$ and not a pole of f(x) we have


$\dfrac{1}{2\pi i}\int_{C_N} \dfrac{f(x)}{x - y}\,dx = f(y) + R(N),$


where R(N) is the sum of the residues of the function f(x)/(x − y) at the
poles of f(x) which are enclosed by $C_N$. We then have




(1.4) $f(y) = \dfrac{1}{2\pi i}\int_{C_N} \dfrac{f(x)}{x - y}\,dx - R(N),$


from which to obtain our expansion. This expansion will consist of two 
parts. The part arising from R(N) corresponds to the Hardy and Ramanujan 
results while the part arising from the integral is analogous to the series 
obtained for forms having polar singularities at i∞ and no other singularities
in the upper half-plane. Despite this similarity it is not true that these two 
separate parts are each modular forms of dimension r belonging to the full 
group. 


2. For the case in which F(τ) has no poles in the upper half-plane, the
curve $C_N$ could be chosen as the circle $|x| = \exp\{-2\pi N^{-2}\}$. However this
curve is not suitable in our case since we must find one that avoids the poles
of f(x). The curve used by Hardy and Ramanujan can be used but it is
geometrically more complicated than the one that will be used.

We first determine a path $\omega_N$ in the τ-plane and then take its image in
the x-plane as $C_N$. For a positive integer N we let $h_s/k_s$ be the s-th fraction
in the Farey series⁵ of order N,

(2.1) $\dfrac{0}{1}, \dfrac{1}{N}, \ldots, \dfrac{h_{s-1}}{k_{s-1}}, \dfrac{h_s}{k_s}, \ldots, \dfrac{N-1}{N}, \dfrac{1}{1},$

and we consider the transformations

(2.2) $T_s(\tau) = \dfrac{h_s\tau + h_{s-1}}{k_s\tau + k_{s-1}},$

where we define $h_0$ to be −1 and $k_0$ to be N. This transformation belongs
to the modular group because of well known properties of Farey fractions.
Under $T_s$ the point i∞ goes into $h_s/k_s$, 0 into $h_{s-1}/k_{s-1}$, and the line ℜ(τ) = 0
from 0 to i∞ goes into the semicircle, through the points $h_{s-1}/k_{s-1}$ and $h_s/k_s$,
lying above the real axis. The first quadrant of the τ-plane is mapped into
the area bounded by the semicircle and the real axis. If any poles of F(τ)
lie on the line ℜ(τ) = 0 we detour around them with small semicircles
extending into the right half-plane and deform the semicircles through $h_s/k_s$
and $h_{s-1}/k_{s-1}$ accordingly. These deformations will all extend downward into
the semicircular area but will not reach the real axis.
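The properties of the Farey fractions used in this construction can be illustrated for a small order. The following is only a sketch, assuming the transformation sends τ to (h_s τ + h_{s−1})/(k_s τ + k_{s−1}) with the convention (h_0, k_0) = (−1, N); the function names are hypothetical.

```python
from fractions import Fraction

def farey_with_h0(N):
    """Pairs (h_s, k_s), s = 0, 1, ..., with the extra leading pair (h_0, k_0) = (-1, N)."""
    fracs = sorted({Fraction(h, k) for k in range(1, N + 1) for h in range(k + 1)})
    return [(-1, N)] + [(f.numerator, f.denominator) for f in fracs]

def T(seq, s, tau):
    """Apply T_s: tau -> (h_s*tau + h_{s-1}) / (k_s*tau + k_{s-1})."""
    (h_prev, k_prev), (h, k) = seq[s - 1], seq[s]
    return (h * tau + h_prev) / (k * tau + k_prev)

seq = farey_with_h0(4)
# Determinant h_s*k_{s-1} - h_{s-1}*k_s for each s >= 1; each should be 1.
dets = [seq[s][0] * seq[s - 1][1] - seq[s - 1][0] * seq[s][1] for s in range(1, len(seq))]
# Images of tau = i under the first and last transformation.
first, last = T(seq, 1, 1j), T(seq, len(seq) - 1, 1j)
```

Since every determinant equals 1, each transformation belongs to the modular group. The two images of τ = i differ by exactly 1, which is what makes the image of the path in the x-plane a closed curve.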

These deformed semicircles join at the points $h_s/k_s$ to yield a continuous
path from τ = −1/N to τ = 1. Finally we detour around the points $h_s/k_s$


⁵ Certain simple properties of the Farey series will be used without mention in this
paper. For these properties and their proofs see E. Landau, Vorlesungen über Zahlentheorie
(1927), pp. 98-100.




along the line ℑ(τ) = B from each semicircle to its neighbor. The constant
B, which may vary with N, is to be taken small enough so that the line will
meet all the semicircles. It is also to be such that this line does not contain
any poles of F(τ) and we shall later add a further restriction.


[Figure: The path $\omega_N$ for the value N = 4.]


For $\omega_N$ we now choose the part of this path between the points
−1/(N + i) and (i + N − 1)/(N + i) which are the images of the point
τ = i under the first and last of the $T_s$. The path $\omega_N$ is entirely above the
real axis and it does not extend further above it than the largest semicircle.
The radii of the semicircles are


<a 
and hence we have, for τ on $\omega_N$,
1 
The end points −1/(N + i) and (i + N − 1)/(N + i) of $\omega_N$ differ
by unity and hence the image $C_N$ of $\omega_N$ in the x-plane is a closed curve. Also,
by (2.3), we find, for x on $C_N$,


(2. 3) 0< B(r) < 


| x | (7) 1. 


and, therefore, $C_N$ lies within the unit circle and approaches it as $N \to \infty$.
We can now write equation (1.4) as


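(2.4) $f(y) = \int_{\omega_N} f(e^{2\pi i\tau})\,e^{2\pi i\tau}\,(e^{2\pi i\tau} - y)^{-1}\,d\tau - R(N)$
$\qquad\quad = \int_{\omega_N} F(\tau)\exp\{2\pi i(1-\alpha)\tau\}\,(e^{2\pi i\tau} - y)^{-1}\,d\tau - R(N).$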


3. In order to evaluate the integral of (2.4) we break the path $\omega_N$ into
parts. We let $\Omega_s$ be that part of $\omega_N$ which joins $h_{s-1}/k_{s-1} + iB$ to $h_s/k_s + iB$
for all s except the first and last, and for these values of s we let $\Omega_s$ be the
corresponding remaining parts of $\omega_N$. If we write




we then have 

WN 8 


where $\sum_s$ denotes the sum over all s for which there is a corresponding Farey
fraction (2.1). In (3.11) we now make the change of variable


hso Ris 


corresponding to the transformation (2.2). If r+ lies on the part of Q, that 
does not consist of the line 3(r) = B then a lies on the line R(o) = 0 or on 
one of the Cstours around a pole of P(r). If = then we have 


The path Qs. The path Qs. 


2 ‘. 1 2 1 2 
) (3¢6) ? 


so σ lies on the circle with center at $-k_{s-1}/k_s + i/(2k_s^2 B)$ and radius
$1/(2k_s^2 B)$. It can easily be verified that the points corresponding to the end
points of $\Omega_s$ are


(x 


except for the first and last values of s, in which case one end point corre- 
sponds to oi. Then as 7 runs along Q,, o runs along the circle from 
— Bk?,_,/(Bksks. +i) to the line R(o) =0; along this line, making the 
necessary detours, until it again meets the circle; and then along the circle to 
—Is1/ks + We shall call this path By taking B sufficiently 
small we can keep the two points of intersection of the line and circle outside 
of the strip 1/A = Q(r) SA. We now have, using (1.11), 


— 




hso + he-, hso + 


(exp { — y) (kso + 


—=— & 2mri(1 — a) 


x (exp (—1(kso + ) (0) dy 


where we have used the abbreviation 
(3. 23) e(hs, he-1, ks, ke-1). 


As we are later going to let N tend to infinity, we can suppose that it
has been chosen so large that the curve $C_N$ wholly contains the circle
|x| = (1 + |y|)/2 for the particular point y, within the unit circle, that
we are discussing. We then have the inequality

(3.3) $|x - y| \ge \dfrac{1 - |y|}{2}$

for all x on the path $C_N$. The integrand of (3.22) can be written as
— y)*(— i (hea + 


where 7, which is given by (3. 21), lies on Qs and «= e?**7 lies on Cy. Then, 
for the part of Q, in the strip 1/A S 9(c) S A we have R(o) = 0 and hence 


| | == (1-a)3(7) 
| kso + |? = (ke R(o) + + 
1 19 
= (hss + hi?) + ke?) 2 


Also we see that the path is of finite length, is independent of N, and is free 
of poles, so |F(σ)| has a bound on this path. Combining these results we see
that the part of I, due to this part of ©, is 




[Figures: The path $\Omega_s'$. The path $\Omega_s''$.]


The remainder of is in two parts. The upper part we call Q,’ and 
the lower 2,’”. It is to be noted that 2,”” and Q,’ for the largest admissible 
value of s are missing. For o on 9,’”” we make the change of variable 
p=—1/o and let ©,” be the corresponding path in the p-plane. It can 
then be seen that Q,’ consists of the path R(o) =0 from Ai to the circle 
la + — i/(2ks?B)| = and then the arc of this circle on to 
the point — ks_1/ks + 1/(k,’B), while ,’” consists of an arc of the circle 
|p — 1/ (2k? =1/(2k?s,B) from ke/ks-1 + to the 
line Jt(p) = 0 and then the segment of this line to Ai. These two paths lie 
entirely above the lines §(o) = 3(p) =A and hence are free of the detours 
that we made to avoid the poles of P(r). We now have 


(3. 4) Is = — fo? Rat (1 — - a) (exp + 


x i(kso -+- ks-1) do 
f exp Rat (1 — ( exp y 
N-r-? 


N-r- 
= 


Applying the transformation equation (1.11) to H.’”” we find 


\ 
(3. 51) HA,” = exp ) a) if 
x (exp } | —y) dp, 
where 
(3. 52) = €(0,—1,1,0). 


On 0,” we have X(p) =A and hence the Fourier expansion (1.3) is valid. 
8 = 


Thus we may write 


(3. 53) H.” =I, + Ke’, 
where 
Res p he 
(3.54) = — eT exp 4 2mt(1— 
ks-ip — ks 


x (—- p— kg) ) tap 
p= 


(3.55) — f. exp Qari (1 — a) — he 
2s 


X (—1(ks-1p — ke) ) Ane?" 


n=0 


If p is on 0,” then r= is on oy and 
is on: Cy and we may write (3.55) in the form 


Kf? = — J. exp{2rt(1 — a)r}(a@—y)” 


@) 


n=0 


| | = | exp {2mi(1 — (a — y) — kee) 


n=0 
where w,’’ is the part of wy that r runs over as p runs over Q,”.. From the 
geometry of the path 2,” and properties of the Farey fractions we have 
N? 


= + k?, > (N — + k?, = 


Also we have 3(p) = A, the inequality (2.3) for 7, and (3.3) for 2, and hence 
(3. 56) K,” N-re- 27aA = | | e- f | dr ') 


In a similar way we find 
(3. 61) H’, =I’, + K’s, 
where 


(3.62) I's Es 2at(1 — a) | (exp 


p=1 


tate 


Resp — he 
x (exp y 
x (exp ( 


and 


Now o,’ and ws” are both parts of Q, and they do not overlap. Hence we may 
combine (3.56) and (3.63) to obtain 


(3. 64) K, (; — 


4, We now consider the integrals J,’ and /.” of (3.62) and (3.54). 
In J,’ we make the change of variable 


o=i 


Zz 


ke ite’ 


and have 


(4.11) =— ‘tes f 2nt(1 a) + 


keg. 
x exp } (— +it)} $a exp | —2niv(—4 4 4, 
where the path of integration consists of the line §(z) =—hks_, from the 


point Ak, —iks. to the point at which this line meets the circle correspond- 
ing to the circular part of Q,’, and then an arc of this circle on to the point 
1/(ksB). 


In we set 


ks 
—i(kep—ky), 


and find 


exp ) + ) ay exp — | ud + dz, 


Kes-1 


where the path of integration consists of a circular arc and a straight line 
joining the points 1/(ks..B) and Aks. + 

We now wish to combine $I'_s$ and $I''_{s+1}$ into a single integral. Since their
paths of integration abut at the point $1/(k_s B)$ it will be necessary only to
transform the integrand of $I''_{s+1}$ to show that it is the same as that of $I'_s$.
It is because of this that we chose the path $\omega_N$ in such a way that the first
and last Q, were incomplete. The effect of that choice is that 7’, and I’, 




for the largest admissible value of s are missing and hence when we sum the 
I, of (3.11) to obtain the integral of (2.4) we are to sum the expression 
(1’, + I’’s.:) over the s corresponding to all Farey fractions (2.1) with the 
exception of the last one, 1/1. 

In order to transform the integrand of $I''_{s+1}$ we first note that from the
property

$h_s k_{s-1} - h_{s-1} k_s = 1, \quad h_{s+1} k_s - h_s k_{s+1} = 1,$

of the Farey fractions, we have


hs ks ‘ 


Using this, the transformation equations (1.11) and (1.12), and the 
definitions (3.23) and (3.52), we find 


kat + hes 
= (—— U(ker + ks-1) + — 


== €s41€9€7 exp {2arta he-rke+1) } (— -+- ) 


(4. 21) Nesrke-1 


and 


and hence we have 


— 


Equation (4.3) is obtained by comparing (4.22) with (4.23) and using 
(4.21). Using this result we may write (4.12) as 


f (7: 


x (cxp { G + 
xX exp (— 2) av exp — 2rw (— + ( dz, 


which we now combine with (4.11) to get 
he i\/ 
he a 
x (exp { (724 } -y) 


X exp (- + iz) | a_y exp — 


8 


(4. 




The path of integration of (4.4) extends from Ak, — iks_, to Alte + ikeat. 
The exact form of this path need not be known. It is sufficient to note that 
if we trace back through the changes of variable that we have made, we find 
that as 2 runs over this path then 


the 
exp) 4) | 


runs over a part of Cy. By (3.3) we then have 


and we can expand the denominator of the integrand of (4.4) in a uniformly
convergent geometric series of non-negative powers of y. Doing this and
interchanging the orders of summation and integration we obtain


(4. 5) +L — > D> av exp | — (n + a) — + (v— @) y” 
kes n=0 ke ke 


x f 2*-* exp (n+ x) dz. 
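The expansion of the denominator rests on the geometric series for 1/(x − y), which converges uniformly because |y| < |x| on the path by (3.3). The following is a minimal numerical sketch of that series alone, not of the full integrand; the function name `cauchy_kernel_series` is hypothetical.

```python
def cauchy_kernel_series(x, y, terms):
    """Partial sum of sum_{n>=0} y^n / x^(n+1), which converges for |y| < |x|."""
    total = 0j
    power = 1 / x          # current term y^n / x^(n+1), starting with n = 0
    for _ in range(terms):
        total += power
        power *= y / x
    return total
```

Each term carries a non-negative power of y, so integrating term by term yields the coefficients of the desired power series in y.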


5. The integrand in (4.5) has no singularities except at the points
z = 0 and z = ∞ so we can deform the path of integration. If n + α > 0
we cut the z-plane along the negative real axis and take the following path:


R(z) = Ak, from Aky — to the point at which | z| = WM, 

|2 |= M from the point at which R(z) = Ak, to the point — M, 

a loop from — M around the origin and back to — M (on the upper border), 
from — M to the point at which R(z) = Akg, 

R(z) = Ak, from the point at which | z|—M to Aly + ths,:. 


[Figure: The path of integration.]


Calling the integrals along these paths $J_1, J_2, J_3, J_4, J_5$, respectively, we have


|z|=M 
= 2x exp ) 


((—a)at, + (n+ a) iz) M-r-, 


and hence $J_2 + J_4 \to 0$ as $M \to \infty$. Also, on the path of $J_1$ we have
z= Ak, — yi, mM, 


Akg Akg 
| |? + = ky? + (N—k.)? = 
and hence 
Aks—ico 
| = exp | = a)Aks + (n+ @) 
Aks—iks-, 
Aks—ioo 
= exp{4AnN?(n + «)} f | |*| dz ) 
Aks—iks-, 


(Aks—iks-)71 


where, in the last integral, we have set w = 1/z and where the path of
integration is along an arc of the circle tangent to the imaginary axis at the
origin and passing through the point $(Ak_s - ik_{s-1})^{-1}$. The length of this path
is at most π/2 times the length of the chord:


= — ik 2h -1 


and therefore we have 


J, = O(N" exp{4AxN?(n + @)}), 
and, similarly, 
J; = O(N" exp{4ArN*(n + @)}). 


Finally we have⁶


(0+) 
lim J; = f exp ((v—a)e + (n+ 


Qn r+1 (0+) ar? 


⁶ G. N. Watson, Theory of Bessel Functions (1922), p. 181, (1).




Combining these results and letting M tend to infinity we then find that the 
integral of (4.5) has the value 


+ O(N + «)}). 


If n + α = 0 we have n = α = 0. It has been shown in the first paper
referred to in section 1 that α = 0 implies r = an integer. The integral can
then be evaluated in this case, as was done in that paper, without cutting the
z-plane. The result is that (5.1) may still be used provided we use the convention
of that paper regarding its meaning in this case.

We can now combine (2.4), (3.11), (3.4), (3.53), (3.61), (3. 64), 
(4.5), and (5.1) to obtain 


v=1 8 n=0 


(r+1) /2 in, 


))) 


The error terms in (5.2) reduce to 


(5.3) o(z{t exp(4AaN-2(n + @)} | y |" 


— O(N* exp(44aN*a} (exp(44aN} | y |)" 


n=0 


= (1 — exp{4AaN-*} | y 


Now it is easily seen that the length of the path $\omega_N$ has an upper bound
independent of N. Also we have 


k-1 N 


” Loc. cit., footnote 2, sections 6 and 7. 




and therefore the error terms (5.3) have the estimate 
O(N~* exp{4ArN *a} (1 — exp{4AxN~} | y|)-*), 


and (5.2) may now be written as 


(5.4) f(y) exp } (— B+ (a) 


(r+1)/2 
— R(N) + O(N~ exp{4ArNa} (1 — exp{4ArN-*} | y |)-*). 
6. If we let N— o in (5.4) the error term will tend to zero and we 
shall have the desired expansion of f(y). In order to perform this we must 
free (5.4) of its dependence on the Farey series of order N. That is, we 
must replace the term 


Kea | 


8-1 
ky § 


€s EXp 2mi(v — @) 


by an equivalent expression involving $h_s$ and $k_s$ but not $h_{s-1}$ and $k_{s-1}$. It is
convenient at this point to omit the subscript s, writing h and k for $h_s$ and $k_s$
but still using $h_{s-1}$ and $k_{s-1}$. We first define h′ to be any solution of the
congruence

(6.1) hh′ ≡ −1 (mod k),

from which we have, at once,

h′ + $k_{s-1}$ ≡ 0 (mod k).

By (3.23) and (1.11) we have 

= 1(kr + (7). 
On the other hand we use (6.1), (1.11), (1.12), and the fact that he_s/hea 
and h/k are successive Farey fractions to get 
p(t 4 bes) (ke +h’) /k) — (14+ 

kr + lees + +h’) /k) —W 


X exp P(r). 
Comparing these two results we have 


=e @ + v’) exp i \, 




and hence, since ν is an integer and k divides (h′ + $k_{s-1}$),


EXP 2ri(v— a) 


Using this we may now write (5.4) in the form 


N 


12 y—a \(r+1)/2 
(6.21) f(y) = 24 (= 


x Tra (FE ) 
+ exp{44aN-*a} (1—exp(44xN} | y |)*), 


where 


(6.22) > (a,—? 


Osh<k k 
(h,k)=1 


x exp ((n ta) b+ (v— a) 


If $N \to \infty$ the first term of the right member of (6.21) becomes an
infinite series which is easily seen to be convergent. The second term becomes
R(∞) which also stands for an infinite series. Since the error term becomes
zero, this second series then converges, provided its terms are summed in the
proper order. By R(∞) we shall mean the infinite series whose value is
$\lim_{N\to\infty} R(N)$. Then we have:

THEOREM 1. If F(τ) is a modular form of positive dimension r, having,
as singularities in the fundamental region, at most a finite number of poles
and a polar singularity at i∞, then we have the expansion


F(r) 


1x (r+1)/2 


XK (4 a) ) yt —R( 2), 


(6.8) f(y) 


k 


where $A_{k,\nu}(n)$ is defined by (6.22) and the remaining constants are
determined by the transformation equations (1.11) and (1.12) and by the
"principal part" of the Fourier expansion (1.3).


7. We shall now obtain the series representation for $R(\infty)$ in the case
in which $F(\tau)$ has a single simple pole in the interior of the fundamental
region and no other singularities in this region except a possible polar
singularity at $i\infty$. The value of $R(\infty)$ in other cases can be obtained in a similar
manner, with obvious modifications.




142 HERBERT S. ZUCKERMAN. 


We let $F(\tau)$ have its simple pole at the point $\tau = \sigma$ and suppose that
its residue there is R. Then, expanding $F(\tau)$ about this point, we have


n(t—o)". 


(7. 11) F(r) =— 


n=0 


Now if $\tau$ lies in the triangle which is obtained from the fundamental region by
the transformation


y 416 a b 
7.12) 2: 


then $(-d\tau + b)/(c\tau - a)$ lies in the fundamental region and we have, by
RE), 


(7.13) $F(\tau) = \varepsilon(-d, b, c, -a)^{-1}\,(-i(c\tau - a))^{r}\, F\!\left(\frac{-d\tau + b}{c\tau - a}\right).$


In this triangle we have a single determination of the branch of $(-i(c\tau - a))^{r}$
and therefore $F(\tau)$ is analytic except for the parabolic points and the points
$(a\sigma + b)/(c\sigma + d)$. For $\tau$ near the latter points we combine (7.11) and
(7.13) to obtain the expansion


F(r) = e(— d, b, c, — 1(cr — 


x | — 


—«(—d,b, = 


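from which we see that $F(\tau)$ has a simple pole at $(a\sigma + b)/(c\sigma + d)$ with
the residue

$\varepsilon(-d, b, c, -a)^{-1}\left(-i\left(c\,\frac{a\sigma + b}{c\sigma + d} - a\right)\right)^{r}\left(c\,\frac{a\sigma + b}{c\sigma + d} - a\right)^{2} R$

$\qquad = -\varepsilon(-d, b, c, -a)^{-1}\,(-i(c\sigma + d))^{-r-2}\,R.$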


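Corresponding to this pole, $f(x) = e^{-2\pi i \alpha\tau}F(\tau)$ will have a simple pole at
the point

(7.21) $\exp\left(2\pi i\,\frac{a\sigma + b}{c\sigma + d}\right)$

with the residue

(7.22) $-2\pi i\,\exp\left\{2\pi i(1 - \alpha)\,\frac{a\sigma + b}{c\sigma + d}\right\}\varepsilon(-d, b, c, -a)^{-1}\,(-i(c\sigma + d))^{-r-2}\,R.$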
Not all transformations (7.12) will give distinct points (7.21). Since
$\sigma$ is not a vertex of the fundamental region we shall get each point




$(a\sigma + b)/(c\sigma + d)$ once and only once if we take the identity transformation
and all transformations (7.12) with $c \ge 1$. Two of these points,
$(a\sigma + b)/(c\sigma + d)$ and $(a_1\sigma + b_1)/(c_1\sigma + d_1)$, will yield the same point
(7.21) in the x-plane if and only if their difference is an integer t,

$\frac{a_1\sigma + b_1}{c_1\sigma + d_1} = \frac{a\sigma + b}{c\sigma + d} + t = \frac{(a + tc)\sigma + (b + td)}{c\sigma + d}.$

Therefore we can take each pair of integers p and q such that $p \ge 1$ and
$(p, q) = 1$, choose a single solution p' of the congruence $pp' \equiv -1 \pmod{q}$,
and then take for our transformations (7.12), all the transformations

(7.3) $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \begin{pmatrix} (pp' + 1)/q & p' \\ p & q \end{pmatrix}.$
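The choice of representatives in (7.3) can be illustrated numerically. The sketch below is not from the paper: the function name `pole_transformation` is hypothetical, the case q = 0 is excluded as an assumption of the sketch, and Python 3.8 or later is assumed for the modular inverse. It builds the matrix with lower row (p, q) and upper row $((pp'+1)/q,\ p')$ and lets one confirm that the determinant is 1.

```python
from math import gcd


def pole_transformation(p, q):
    """Return (a, b, c, d) = ((p*p' + 1)/q, p', p, q), which has a*d - b*c = 1."""
    if p < 1 or q == 0 or gcd(p, q) != 1:
        raise ValueError("need p >= 1, q != 0 and gcd(p, q) = 1")
    modulus = abs(q)
    # p' is one solution of p*p' = -1 (mod q); for |q| = 1 any integer works.
    p_prime = (-pow(p, -1, modulus)) % modulus if modulus > 1 else 0
    a = (p * p_prime + 1) // q  # exact division, since q divides p*p' + 1
    return a, p_prime, p, q
```

For instance, `pole_transformation(2, 3)` uses p' = 1 and returns the matrix rows (1, 1) and (2, 3). A different solution p' of the congruence changes the upper row by an integer multiple of the lower row, which is the integer translation t discussed above.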


We also must consider the poles in the x-plane due to the parabolic
points. The point $\tau = i\infty$ corresponds to $x = 0$ while the other parabolic
points correspond to points on the unit circle and hence are not included in
$R(\infty)$. For $|x|$ small we have, by (1.3),


f(a) — 


and hence, since $y \ne 0$, we find, at $x = 0$,


1—a2/y Yon mo 


= Ony™. 


m=1 


From this, (7.22), (7.3), and the fact that y is distinct from the
singularities of f(x), we find that the sum of the residues of $f(x)/(x - y)$
at all the poles of f(x) within the unit circle has the value


— + __ y)-1 


m=1 


(p,q)=1 


where, in an effort to simplify the notation, we are using the abbreviation

(7.5) $\sigma' = \frac{((pp' + 1)/q)\,\sigma + p'}{p\sigma + q},$

so $\sigma'$ is a function of the indices of summation as well as of $\sigma$.




The infinite series (7.4) is absolutely convergent. To see this we first
note that, since $\sigma$ is in the fundamental region, there are only a finite number
of its images of the type (7.5) which lie above any fixed line parallel to the
real axis and above it. Also we have


| = 1. 


If y is in a region 0 < y% S| y| <1 then, leaving aside the finite number 


a<l, >9, 
of terms for which ; 
| io’ | <= + Yo 
— 9 > 
we see that the infinite series in (7.4) is majorized by the series 


which is known to converge for r > 0. 
Since (7.4) is absolutely convergent we can rearrange its terms to agree
with our definition of $R(\infty)$. Then we have:


THEOREM 2. If the only singularities of the modular form $F(\tau)$ of
Theorem 1, in the fundamental region, are a possible polar singularity at $i\infty$
and a simple pole with residue R at the point $\sigma$ in the interior of the
fundamental region, then the quantity $R(\infty)$ of Theorem 1 has the value (7.4),
which is absolutely convergent for any y inside the unit circle which is not a
singularity of f(x).


If $F(\tau)$ has several simple poles the value of $R(\infty)$ is the sum of
corresponding expressions of the type (7.4). If $F(\tau)$ has a simple pole at a
vertex of the fundamental region the restrictions on p and q in (7.4) have
to be strengthened.


8. As in the case in which $F(\tau)$ is analytic in the upper half-plane,⁸
we can characterize the class of all functions which satisfy the conditions of
Theorem 1. We shall find this characterization in a somewhat different way,
basing it on a formula connecting the number of zeros and poles of a modular
form. We suppose that $F(\tau)$ has Z zeros and P poles in the fundamental
region omitting the vertices, $i\infty$, $i$, $\rho = e^{\pi i/3}$, and $\rho^2$. Also we suppose that,
at $i\infty$, $F(\tau)$ has an expansion of the form (1.3) with the associated constants
$\alpha$ and $\mu$, while it has zeros of orders s and t at $i$ and $\rho$ respectively. The integers


8 Loc. cit., footnote 2, section 8. 




s and t are to be taken negative if $F(\tau)$ has a pole at the corresponding
points. Then these constants are connected by the formula


(8. 11) 
This formula can be found by considering the integral,

(8.12) $\frac{1}{2\pi i}\int d(\log F(\tau)),$

taken over the path formed by the sides of the fundamental region and a line
parallel to the real axis and sufficiently far above it. Circular detours are
made around the points $i$, $\rho$, $\rho^2$, and any poles of $F(\tau)$; and then (8.11) is
found as the limit of (8.12) as the horizontal line approaches $i\infty$ and the
circular detours shrink down to their points.

We now multiply $F(\tau)$ by factors to obtain a new function which has
no zeros or poles in the fundamental region, including the vertices. To cancel
the Z zeros and P poles, which we may suppose to be at the points $\rho_1, \rho_2, \ldots, \rho_Z$
and $\sigma_1, \sigma_2, \ldots, \sigma_P$ respectively, we use the factor


t) —J(px)), 


(8. 21) @(r) =I (J ( 


which may introduce new zeros or poles at $i\infty$ but nowhere else. To take
care of the finite vertices we use the factor⁹


(8. 22) ®(r) =(vi@—1) (7) 


Finally the factor »(7)*" will suffice for the point r—10. To see this we 


expand the function 

(8. 23) U(r) = (7) 

about the point $i\infty$, using the known expansions of $J(\tau)$ and $\eta(\tau)$, and the
expansion (1.3) of $F(\tau)$. We then have


where we have used (8.11) to obtain the last equality. 
The form $\Psi(\tau)$ is a modular function. To prove this we need consider
only the two generators of the modular group. For the function (8.21)
we have


9 We use the usual determinations of the roots, loc. cit., footnote 2, p. 449.




(8. 31) @(r+1)—O(r), (=*)- a(r), 


since $J(\tau)$ is a modular function. The function (8.22) has the
transformation formula


t 
(8.32) (7+ 1) —exp | (5+ +t) ®(7), (=) == 


and »(7)°" the formula 
(8.33) (7+ exp (2 mi 7) (— tr) "q(7)*. 


From (8. 23), (1.12), (8.31), (8.32), and (8.33) we then find 


(8. 41) W(7 + 1) = exp (2 + 13 + 


but from (8.11) we have 


an integer, and hence (8.41) reduces to

(8.42) $\Psi(\tau + 1) = \Psi(\tau)$.

Also, from (8.23), (1.11), (8.31), (8.32), and (8.33), we get
(8. 43) (=) (zr), 


where 
(8.44) $\varepsilon_0 = \varepsilon(0, -1, 1, 0)$.


Now the value of $\varepsilon_0$ is known¹¹ to be


r\ 
(8. 45) €) = exp ( r) ( 
and therefore we have, using (8.11), 


— exp = exp{2ni(— 3P — + 34 + = 


and hence (8. 43) reduces to 
—1 
(8. 46) )= 


10 Loc. cit., footnote 2, p. 449, formulas (8.63) and (8.64).
11 Loc. cit., footnote 2, p. 445, formula (6.7). The proof used there applies without 
change to our case in which F(r) has poles in the fundamental region. 




Formulas (8.42) and (8.46) show that $\Psi(\tau)$ is a modular function.
Since it has no singularities it is a constant, and, by (8.23), we have


(8.5) F (7) = 
p 
a= Kn(7)-** I] (J (7) 


TL 


x (VIG=1) (WIG). 


Any form $F(\tau)$ that satisfies the conditions of Theorem 1 can then be
expressed in the form (8.5). Conversely, it is clear that any form (8.5)
satisfies these conditions provided we have r > 0 and the $\sigma_j$ and $\rho_k$ distinct
from the vertices of the fundamental region. The numbers $\mu$ and $\alpha$ associated
with this form may be obtained from (8.11). Since $\mu$ is an integer and
$0 \le \alpha < 1$ we have


(8.61) 

and 


We can express the roots (8.22) in terms of $g_2(1, \tau)$, $g_3(1, \tau)$, and $\eta(\tau)$
and then write (8.5) in the form


X g2(1, 7)’9s(1, I] (J —J(o;))* II (J (r+) —J(px)), 


where the constant K has changed its value. 


9. The discussion of the $\varepsilon(a, b, c, d)$, given in the paper¹² referred to
above, for forms analytic in the upper half-plane, applies directly to our case
also. Hence we can immediately state the


THEOREM 3. The modular form, (8.7), of dimension r, satisfies the
transformation equations


(9.11) F b,c, d)(—1(er + d))*F (7), > 0, 
er+d 


and 


(9.12) F(r +1) = (7), 


=I, 
12 Loc. cit., footnote 2, section 9.




with the multiplier 
d 
(9.21) e(a,b,c,d) =exp r-s(a,c) — 120 


(b(a +d) +a(b—c) + bc) 


+ (a+ d)(b—c) (ad + ic) ) \ 
and with the « of (8.62), where 


Conversely, all modular forms satisfying the conditions of Theorem 1 are
contained in (8.7).
We can apply Theorem 1 to the form (8.7) and evaluate the $A_{k,\nu}(n)$ by
means of (9.21) to obtain


THEOREM 4. If r > 0, the modular form (8.7), in which the $\rho_k$ and $\sigma_j$
are distinct from the vertices of the fundamental region, has the expansion


F(r) 
(9-31) f(y) = Sart (2) 
k=1 n=0 n+ 


where a and p are given by (8.61) and (8.62): the ay are the coefficients 
of in the expansion (7), valid for sufficiently large: 
R(«) is the sum of the residues of f(x)/(x—y) at the poles of f(x) within 
the unit circle, as described in section 6; and where 

(9.382) Axv(n) = or (h, k)E(h, k)* 


(h,k)=1 


with 
oy (h, k) = exp{2air: s(h, ky}, 


exp { xi (— +1)t, 


8 These are the same as the functions (9.54) of the paper referred to in footnote 
2. It should be noted that there is an error in the expression for é(h,k) in that paper. 




10. The modular form 
1 
(10. 11) I(r) = (J(r) 


affords an interesting example. We limit r to be positive and $\sigma$ to be distinct
from the vertices of the fundamental region. Then $F_r(\tau)$ is of the form
(8.7) with

(10.12) Pond, 


and (8.61) and (8.62) yield the values 


r r 
Corresponding to (1.3) we have the expansion 


(10. 14) (7) go (r/12) (1 + + 24:7 + 
K (a1 + (744 — 1728/ (co) ) + 1968842 - 
= (a — (744 — 2r— 1728) (oc) 
= — (744 — 2r — 1728) 


and, therefore, the values 

(10. 15) a» = 1, = 1728) (0) — 7444 
If $0 < r \le 12$ we have $\mu = 0$ and hence, by Theorem 4,


The value of $R(\infty)$ may be found by Theorem 2. The residue of $F_r(\tau)$ at
the point $\tau = \sigma$ is
1 


(10. 3) fi 


no) (a), 


where $J'(\sigma)$ is the derivative of $J(\tau)$ at $\tau = \sigma$. The value of
$\varepsilon(-q, p', p, -(pp' + 1)/q)^{-1}$ in (7.4) may be replaced by its value (9.21) and
we then have
(10. 41 ) ~2r (_ p2mi(1-a)o ( -1 
= £(p, q)"(— i(po + —y)}, 


p=1 
(p.q)=1 




with the « of (10.13) and where we have used the abbreviations (7.5) and 


(10. 42) exp { —2ni (s(—g, p) 


With this value for $R(\infty)$ we then have the desired expansion of $F_r(\tau)$ in
(10.2).

A formula concerning modular forms of negative dimension can be
obtained from our expansions of $F_r(\tau)$. For $\Im(\tau) > \Im(\sigma)$ we expand each
summand of (10.41) in a geometric series and then have, using (10.2)


00 
F,(r) 1728 (0) 447 > e2tin(T~o) 
8 n=0 


n= 

(p,q)=1 


On the other hand, by (10.14) with the values (10.13), we also have 
F,(r) = e?™497{1 — (744 — 2r —1728J(c) +}, 
valid for $\Im(\tau) > \Im(\sigma)$. A comparison of the leading terms now yields the
result
(10.51) 
—— E(p,q)"(—i(po + q)) | 


p=1 
(p,q) =1 
for $0 < r \le 12$ and $\sigma$ inside the fundamental region. By analytic continuation
the restriction on $\sigma$ can be removed and we have (10.51) for $\Im(\sigma) > 0$.
For the case r = 12 this formula simplifies considerably. Making use
of the result¹⁴


(10.52) $12\,s(h, k) \equiv \frac{h - h'}{k} \pmod 1$,

for $hh' \equiv -1 \pmod k$, we have, since $-q \cdot \frac{pp' + 1}{q} \equiv -1 \pmod p$,
12s(— q; p) — + 1)/q (mod 1), 

and therefore 


(10. 53) t(p,q)*=1. | 
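The congruence behind (10.52) can be checked with exact rational arithmetic. The sketch below is an illustration only: it assumes the usual definition of the Dedekind sum $s(h,k) = \sum_{\mu=1}^{k-1} ((\mu/k))((h\mu/k))$, it assumes that (10.52) is read as the classical congruence $12\,s(h,k) \equiv (h - h')/k \pmod 1$ with $hh' \equiv -1 \pmod k$, and the function names are hypothetical.

```python
from fractions import Fraction
from math import floor, gcd


def sawtooth(x):
    """((x)) = x - [x] - 1/2 for non-integral x, and 0 for integral x."""
    if x.denominator == 1:
        return Fraction(0)
    return x - floor(x) - Fraction(1, 2)


def dedekind_sum(h, k):
    """Dedekind sum s(h, k), computed exactly with rational numbers."""
    terms = (sawtooth(Fraction(m, k)) * sawtooth(Fraction(h * m, k)) for m in range(1, k))
    return sum(terms, Fraction(0))


def congruence_defect(h, k):
    """12*s(h, k) - (h - h')/k, where h*h' = -1 (mod k); an integer if the congruence holds."""
    h_prime = (-pow(h, -1, k)) % k if k > 1 else 0
    return 12 * dedekind_sum(h, k) - Fraction(h - h_prime, k)
```

For example, `dedekind_sum(1, 5)` is 1/5, and with h' = 4 the defect `congruence_defect(1, 5)` equals 12/5 + 3/5 = 3, an integer.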


14 H. Rademacher, "Zur Theorie der Modulfunktionen," Crelle, vol. 167 (1931),
pp. 312-336, in particular p. 321, formula (2.51).




Then (10.51) reduces to the expression 


p=1 q=- 
(p,q)= 


which can easily be reduced to the more usual form 
> -14 
91113 (po + q) 


p=-00 


= 


Returning to our original example, (10.11), we now consider the case
$12 < r \le 24$. We now have $\mu = 1$ and hence, by Theorem 4 with the values
(10.13) and (10.15), we find


I(r) — (er), 


Ira (EVGA ) 


where A\")(m) has the value 


(10.62) AfP(n) = | — (—h’ + (n+ 2)h) 
(isk 

The value of $R(\infty)$ may again be found by Theorem 2. It is found to have
the value (10.41) with the added term

For r = 24 the expression for $F_r(\tau)$ again simplifies. In this case we
have $\alpha = 0$ and hence, making use of (10.53), we find


(10.71) = A 2) (n) 5/2) 9, 4. 
2a 
p=1 q=—-00 
(p.q)=1 
where, by (10.62) and (10. 52), 
OSh<k 
(hyk)=1 


It is of interest to ask whether the two parts of our expansions are each
separately modular forms. We shall not answer this question completely but
shall merely give some indications as to the answer by considering the
particular example (10.71). We first write $F_{24}(\tau) = G(\tau) + H(\tau)$ where
$G(\tau)$ is the part of (10.71) which is a power series in $e^{2\pi i\tau}$ and $H(\tau)$ is the



remainder. The series $G(\tau)$ converges for all $\tau$ in the upper half-plane so
$G(\tau)$ has no finite poles. If $G(\tau)$ is a modular form of positive dimension
belonging to the full group then it can be seen¹⁵ from its expansion in
(10.71) that it is of dimension 24. The characteristic constants of $G(\tau)$
then have the values r = 24, 2 =0, P=0, 720, and t= 0, 
However these values contradict equation (8.11) and therefore G(r) is not 
a modular form of positive dimension belonging to the full group. 

We now turn to the function $H(\tau)$. From its value in (10.71) we have
$H(\tau + 1) = H(\tau)$ and hence $\alpha = 0$. Also $H(\tau)$ remains finite as $\tau \to i\infty$
and therefore we have $\mu = 0$. Since $P = 1$, $Z \ge 0$, $s \ge 0$ and $t \ge 0$ we see,
by (8.11), that if $H(\tau)$ is a modular form of positive dimension r belonging


to the full group, then we have p= 0, Z =0 and 


Ss 


Then by Theorem 3, H(r) is one of the forms 


where K may depend on $\sigma$. From our definitions of the functions we now have


H(r) = n(r) (7) G(r), 


which we combine with (10.81) to obtain 


This equation must be valid for $\tau \ne \sigma$ and $\sigma$ within the fundamental region.
By continuation it then holds for all $\tau$ and $\sigma$ in the upper half-plane. From
(10.83) we see that K is a modular form of dimension 0 in the variable $\sigma$.
However if we set $\tau = \sigma$ in (10.82) we see that K is of dimension (24 - r).
Therefore we have r = 24. This contradicts the condition (10.81) and we
therefore see that $H(\tau)$ is not a modular form of positive dimension belonging
to the full group.


UNIVERSITY OF PENNSYLVANIA, 
PHILADELPHIA, PENNSYLVANIA. 


15 Loc. cit., footnote 2, p. 458, footnote 18. 




THE EXPONENTIAL REPRESENTATION OF AUTOMORPHS OF A 
SYMMETRIC OR HERMITIAN MATRIX.* 


By JOHN WILLIAMSON.


In a previous paper¹ the exponential representation of canonical matrices
was studied. These canonical matrices are automorphs of the normal form of 
a skew symmetric matrix. The corresponding problem, when the skew sym- 
metric matrix is replaced by a symmetric or hermitian matrix, is considered 
here. Three cases are treated: when the matrix is symmetric over the complex 
field, when the matrix is hermitian, and when the matrix is symmetric over the 
real field. Of these three the last is by far the most interesting. The methods 
employed are similar to those of the paper quoted above and, as in many cases 
the proofs are practically identical, they will not always be given in detail. 

1. We shall consider square matrices over the real or the complex field
and, if H is such a matrix, we shall mean by H* either the transposed or else
the conjugate transposed of H. Let H be non-singular and let H = H*, so
that H is either symmetric or hermitian. If G = -G* and

(1) C = exp(HG),

then

(2) CHC* = H;

for CHC* = exp(HG) H exp(HG)* = exp(HG) H exp(G*H*)
= exp(HG) H exp(-GH) = exp(HG) exp(-HG) H = H.

Since the determinant of a matrix is the product of the latent roots of the
matrix,

$|\exp(HG)| = \exp(\text{trace of } HG)$.

If $H = (h_{ij})$ and $G = (g_{ij})$, $(i, j = 1, 2, \ldots, n)$,

$t = \operatorname{trace}(HG) = \sum_{i=1}^{n}\sum_{j=1}^{n} h_{ij}\,g_{ji} = -\sum_{i=1}^{n}\sum_{j=1}^{n} \bar h_{ji}\,\bar g_{ij} = -\bar t.$

Therefore, when * denotes conjugate transposed, the trace of HG is a pure
imaginary number and the determinant of exp(HG) has absolute value one.


* Received April 20, 1939. 

1 John Williamson, "The exponential representation of canonical matrices," American
Journal of Mathematics, vol. 61 (1939), pp. 897-911. This paper will be referred
to as I.




In the other case, when * denotes transposed, t = -t and therefore t is zero.
Consequently,

(3) |exp(HG)| = exp(0) = +1.

We shall determine here necessary and sufficient conditions, that a matrix C,
which satisfies (2), shall have an exponential representation of the form (1).
If a matrix C satisfies (2), since H is non-singular, |C| |C|* = 1. Therefore
the absolute value of |C| is unity and, when * denotes transposed,
$|C| = \pm 1$. As a consequence of (3), when H is symmetric, we must restrict
our consideration to those matrices C, whose determinants have the value
plus one.
If A = HG,

AH = HGH = -HA* = H f(A*),

where f(x) = -x. Hence A is normal² with respect to H.


Let P be a non-singular matrix and let

(4) $PHP^* = H_1$ and $PCP^{-1} = C_1$.

Then, if $CHC^* = H$, $C_1H_1C_1^* = H_1$. Similarly, if $C = \exp A = \exp(HG)$,
$C_1 = \exp A_1 = \exp(H_1G_1)$, where $G_1 = -G_1^*$.
Since

(5) $A_1 = PAP^{-1}$,

$A_1$ is normal with respect to $H_1$. Further the matrix C is also normal with
respect to H and the matrix $C_1$ normal with respect to $H_1$, where the defining
polynomial is $f(x) = x^{-1}$. For brevity, when equations (4) are satisfied for
some matrix P, we shall write $(H, C) \sim (H_1, C_1)$ and, when (4) and (5) are
satisfied, $(H, A) \simeq (H_1, A_1)$. Since both the symbols $\sim$ and $\simeq$ have all the
properties of an equivalence relation we have,


RESULT (a). A matrix C, which satisfies (2), has an exponential
representation of the form (1), if, and only if, there exists a pair $(H_1, C_1) \sim (H, C)$
and a pair $(H_1, A_1) \simeq (H, A)$, where $C_1 = \exp A_1$.

Since canonical pairs $(H_1, A_1) \simeq (H, A)$ and canonical pairs $(H_1, C_1) \sim (H, C)$
are known, it is only necessary³ to compare the matrices $\exp A_1$
with the known matrices $C_1$.


2 John Williamson, "Matrices normal with respect to an hermitian matrix,"
American Journal of Mathematics, vol. 60 (April, 1938), pp. 355-373; "Normal matrices
over an arbitrary field of characteristic zero," American Journal of Mathematics,
vol. 61 (April, 1939). These papers will be referred to as II and III respectively.

3 Cf. I, page 898.




2. Hermitian case. Let H be a non-singular hermitian matrix over
the complex field and let H* denote the conjugate transposed of H. Since C is
normal with respect to H and $f(x) = x^{-1}$, the canonical forms $(H_1, C_1) \sim (H, C)$
can easily be written down as particular cases from the general results of II
or III. The actual forms of $H_1$ and $C_1$ are, however, not necessary. It is
sufficient for our purposes, that the matrices $H_1$ and $C_1$ are similarly
partitioned diagonal block matrices. These blocks depend, though not always
uniquely, on the elementary divisors of $C - \lambda E$ or, as we shall say, on the
elementary divisors of C. These elementary divisors are of two distinct types;⁴

Type (i): the pair $(\lambda - a)^r$, $(\lambda - \bar a^{-1})^r$, where $|a| \ne 1$, and
Type (ii): $(\lambda - a)^r$, where $|a| = 1$.

Since the matrices of the canonical pair are similarly partitioned diagonal
block matrices, the general case reduces to the consideration of two particular
cases; that, in which C has a single pair of elementary divisors of type (i),
and that, in which C has a single elementary divisor of type (ii). In the
first of these cases the canonical pair $(H_1, C_1) \sim (H, C)$ is unique; in the
second there are two non-equivalent pairs $(\rho X_1, C_1)$, where $C_1$ and $X_1$ are
unique but $\rho = \pm 1$.

The matrices of the canonical pair $(H_1, A_1) \simeq (H, A)$ are again similarly
partitioned diagonal block matrices depending on the elementary divisors of A.
These elementary divisors are of two distinct types;

Type ($\alpha$): the pair $(\lambda - p)^r$, $(\lambda + \bar p)^r$, where $p \ne -\bar p$, and
Type ($\beta$): $(\lambda - p)^r$, where $p = -\bar p$.

If A has the single pair of elementary divisors of type ($\alpha$), $A_1$ and $H_1$ are
unique. The matrix $\exp A_1$ has the single pair of elementary divisors $(\lambda - a)^r$,
$(\lambda - \bar a^{-1})^r$, where $a = \exp p$. Since $p \ne -\bar p$, $|a| \ne 1$. Further, if $|a| \ne 1$,
we can always determine $p = \log a$ where $p \ne -\bar p$. Consequently every
matrix C with the single pair of elementary divisors of type (i) has an
exponential representation of the form (1). If A has the single elementary
divisor of type ($\beta$), the canonical pair $(H_1, A_1) \simeq (H, A)$ is not unique. The
matrix $A_1$ is unique but $H_1 = \rho Y_1$, where $\rho = \pm 1$ and $Y_1$ is unique.⁵ The
matrix $\exp A_1$ has the single elementary divisor $(\lambda - a)^r$, where $a = \exp p$.
Since $p = -\bar p$, $|a| = 1$. Conversely, if $|a| = 1$, $p = \log a$ satisfies $p = -\bar p$.


4 II, page 360; John Williamson, "Quasi-unitary matrices," Duke Mathematical
Journal, vol. 3 (December, 1937), no. 4, page 414.
5 The matrix $Y_1$ may be taken to be the same as the matrix $X_1$ in the canonical
pair $H_1$, $C_1$ of type (ii).




Since there are two distinct canonical pairs $Y_1$, $A_1$ and $-Y_1$, $A_1$, we get two
distinct canonical pairs $\pm Y_1$, $\exp A_1$ for the pair H, C, where C has a single
elementary divisor of type (ii). Therefore we have the theorem:


THEOREM 1. If C is a conjunctive automorph of the non-singular hermitian
matrix H, C = exp(HG), where G is anti-hermitian.⁶


If H = E, the identity matrix, C is unitary and we obtain the well known
corollary:


COROLLARY 1. Every unitary matrix U may be written in the form
U = exp G, where G is anti-hermitian.⁷

Since, if H is hermitian, iH is anti-hermitian we also have the corollary:


COROLLARY 2. If C is a conjunctive automorph of the non-singular
anti-hermitian matrix H, C = exp HG, where G is hermitian.


3. Complex field. Let H be a non-singular symmetric matrix over the
complex field and let * denote transposed. The canonical pairs $H_1$, $C_1$ and
$H_1$, $A_1$ are now uniquely determined by the elementary divisors of C and A
respectively. As in § 2 the general case of a matrix C can be deduced from
two particular cases; that, in which C has a single pair of elementary divisors
$(\lambda - a)^r$, $(\lambda - a^{-1})^r$, and that, in which C has a single elementary divisor
$(\lambda \pm 1)^{2k+1}$. The two particular cases, from which the general case of a
matrix A may be deduced, are those, in which A has a single pair of elementary
divisors $(\lambda - p)^r$, $(\lambda + p)^r$ and a single elementary divisor $\lambda^{2k+1}$. It follows
immediately that every matrix C with a single pair of elementary divisors
$(\lambda - a)^r$, $(\lambda - a^{-1})^r$ does have an exponential representation of the form (1),
as does a matrix C with the single elementary divisor $(\lambda - 1)^{2k+1}$. On the
other hand a matrix C with a single elementary divisor $(\lambda + 1)^{2k+1}$ does not
have⁸ an exponential representation of the form (1). We therefore have the
theorem,


THEOREM 2. Let C be an automorph of the non-singular symmetric
matrix H over the complex field. The matrix C has an exponential
representation of the form C = exp HG, with a skew symmetric G, if, and only if,


6 This answers for the case of finite matrices a question raised by Aurel Wintner,
"Über die automorphen Transformationen beschränkter nicht-singulärer hermitescher
Formen," Mathematische Zeitschrift, vol. 39 (1933), page 263.

7 Aurel Wintner, "Spektraltheorie der unendlichen Matrizen," (Leipzig, 1929),
page 217.

8 This is obvious directly, since |C| = -1.




no elementary divisor $(\lambda + 1)^{2k+1}$ occurs an odd number of times among the
elementary divisors of C.


Let C have the single elementary divisor $(\lambda + 1)^r$, where r = 2k + 1.
Let $E_r$, $U_r$, and $T_r$ be respectively the unit matrix, the auxiliary unit matrix
and the counter unit matrix of order r. Then, if⁹

(6) $X_r = [1, -1, \ldots, (-1)^{r-1}]\,T_r$,

we may take $H_1 = X_r$ and $C_1 = \Gamma_r$, where

(7) $\Gamma_r = -\exp U_r$.

If $B_r$ is the diagonal block matrix given by $B_r = [bE_k, 1, b^{-1}E_k]$,

$B_r X_r B_r^* = X_r$,

and therefore, if

(8) $D_r = B_r\Gamma_r$,

$D_r$ is an automorph of $X_r$. The elementary divisors of $D_r$ are obviously
$(\lambda + b)^k$, $(\lambda + b^{-1})^k$ and $(\lambda + 1)$. Further

(9) $\lim_{b \to 1} D_r = \Gamma_r$.
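The statements that the matrix $-\exp U_r$ of (7) is an automorph of the matrix $X_r$ of (6), and that the product in (8) is again an automorph, can be checked numerically for small odd r. The sketch below is an illustration under stated assumptions: the auxiliary unit matrix is taken to have its units in the superdiagonal, the counter unit matrix its units in the secondary diagonal, the value b = 1.5 used in the check is arbitrary, and all function names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def mat_exp(a, terms=40):
    """Truncated power series for exp(a); exact up to rounding for nilpotent a."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[x / m for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def x_matrix(r):
    """X_r = [1, -1, ..., (-1)^(r-1)] T_r, with T_r the counter unit matrix."""
    return [[(-1.0) ** i if i + j == r - 1 else 0.0 for j in range(r)] for i in range(r)]


def gamma_matrix(r):
    """-exp(U_r), with U_r assumed to carry its units in the superdiagonal."""
    u = [[1.0 if j == i + 1 else 0.0 for j in range(r)] for i in range(r)]
    return [[-x for x in row] for row in mat_exp(u)]


def b_matrix(r, b):
    """B_r = [b E_k, 1, b^(-1) E_k] for r = 2k + 1."""
    k = (r - 1) // 2
    diag = [b] * k + [1.0] + [1.0 / b] * k
    return [[diag[i] if i == j else 0.0 for j in range(r)] for i in range(r)]


def is_automorph(c, h, tol=1e-9):
    """True if c h c* = h up to the tolerance, with * the transposed."""
    ct = [list(row) for row in zip(*c)]
    chct = mat_mul(mat_mul(c, h), ct)
    return all(abs(chct[i][j] - h[i][j]) < tol for i in range(len(h)) for j in range(len(h)))
```

For r = 3, for instance, `is_automorph(gamma_matrix(3), x_matrix(3))` and `is_automorph(mat_mul(b_matrix(3, 1.5), gamma_matrix(3)), x_matrix(3))` both hold.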


If C has only the elementary divisors $(\lambda + 1)^{r_i}$, where $r_i = 2k_i + 1$,
$i = 1, 2, \ldots, s$, we may take

(10) $H_1 = [X_{r_1}, X_{r_2}, \ldots, X_{r_s}]$, $C_1 = [\Gamma_{r_1}, \Gamma_{r_2}, \ldots, \Gamma_{r_s}]$,

where $X_r$ and $\Gamma_r$ are defined by (6) and (7). If

(11) $D = [D_{r_1}, D_{r_2}, \ldots, D_{r_s}]$, where $D_r$ is defined by (8),

then, as a consequence of (9),

$\lim_{b \to 1} D = C_1$.

The elementary divisors of D are the s pairs $(\lambda + b)^{k_i}$, $(\lambda + b^{-1})^{k_i}$ and
the s elementary divisors $(\lambda + 1)$. If s is even, D has, as a consequence of
Theorem 2, an exponential representation $\exp(H_1G_1)$. Now, if C is a proper
automorph of H, so that |C| is +1, and C does not have an exponential
representation of the form (1), the matrices $H_1$, $C_1$ of the canonical pair must
include as submatrices the matrices given by (10), where s is even. If, in $C_1$,
this submatrix be replaced by the matrix D in (11), the resulting matrix has
an exponential representation $\exp(H_1G_1)$ and its limit as $b \to 1$ is $C_1$.
Therefore we have the theorem:


9 See II, page 356, or III, page 337, or I, page 903, footnote 13.




THEOREM 3. If C is a proper automorph of the non-singular symmetric
matrix H over the complex field, C is either of the form exp(HG), where G is
skew symmetric or is the limit of automorphs, which are.


It is of course obvious that, if |C| = -1, $C \ne \operatorname{Lim} D$, where
D = exp(HG).

If C is not representable in the form (1), we may write $C_1 = [C_2, C_3]$,
where all the elementary divisors of $C_3$ are of the form $(\lambda + 1)^r$ while none
of $C_2$ are of that form. If $E_i$ is the unit matrix of the same order as $C_i$, the
matrix $J = [E_2, -E_3]$ is an automorph of $H_1$ and is of period two. The
matrix $JC_1$ is an automorph of $H_1$ and, since no latent root of $JC_1$ has the
value minus one, $JC_1 = \exp(H_1G_1)$. Accordingly we have the theorem;


THEOREM 4. If C is an automorph of H, which does not have an
exponential representation of the form (1), there exists an automorph D of H,
such that DC does have such a representation. The automorph D is of
period two.

If C is proper, the number of elementary divisors of D of the form $\lambda + 1$
is always even and we therefore have

COROLLARY 1. If C is proper, the matrix D of Theorem 4 has an
exponential representation of the form (1).


4. The real field. Let H be a non-singular real symmetric matrix and
let * denote transposed. The general canonical pair $(H_1, C_1) \sim (H, C)$ can
again be deduced from that of several simple types of the matrix C. The
simple matrix C has elementary divisors of the following types:

Type (i): a single pair of real elementary divisors $(\lambda - a)^r$, $(\lambda - a^{-1})^r$.

Type (ii): the four elementary divisors
$(\lambda - a)^r$, $(\lambda - \bar a)^r$, $(\lambda - a^{-1})^r$, $(\lambda - \bar a^{-1})^r$; $|a| \ne 1$.

Type (iii): a single pair of elementary divisors
$(\lambda - a)^r$, $(\lambda - \bar a)^r$; $|a| = 1$.

Type (iv): the single elementary divisor $(\lambda \pm 1)^{2k+1}$.

In types (i) and (ii) $H_1$ and $C_1$ are unique but in types (iii) and (iv),
while $C_1$ is unique, $H_1 = \rho X_1$, where $\rho = \pm 1$ and $X_1$ is unique.¹⁰
The general canonical pair $(H_1, A_1) \simeq (H, A)$ can also be deduced from


10 II, page 371, or III, page 351.




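that of several simple types of the matrix A. The simple matrix A has
elementary divisors of the following types:

Type ($\alpha$): a single pair of real elementary divisors
$(\lambda - p)^r$, $(\lambda + p)^r$.
Type ($\beta$): the four elementary divisors
$(\lambda - p)^r$, $(\lambda - \bar p)^r$, $(\lambda + p)^r$, $(\lambda + \bar p)^r$; $p \ne \pm\bar p$.
Type ($\gamma$): a single pair of elementary divisors
$(\lambda - p)^r$, $(\lambda - \bar p)^r$; $p = -\bar p$.
Type ($\delta$): the single elementary divisor $\lambda^{2k+1}$.

In types ($\alpha$) and ($\beta$) the matrices $H_1$ and $A_1$ are unique, while in types
($\gamma$) and ($\delta$), $A_1$ is unique but $H_1 = \rho Y_1$, where $\rho = \pm 1$. The canonical
forms are of course all real.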

If A is of type ($\alpha$), exp A has the single pair of elementary divisors
$(\lambda - a)^r$, $(\lambda - a^{-1})^r$, where, since p is real, $a = \exp p$ is positive. If A is
of type ($\beta$), the matrix exp A has the four elementary divisors $(\lambda - a)^r$,
$(\lambda - \bar a)^r$, $(\lambda - a^{-1})^r$, $(\lambda - \bar a^{-1})^r$, where $a = \exp p$. Since the real part of p
is not zero the absolute value of a is not one. The number a is real if, and
only if, the imaginary part of p is an integral multiple of $\pi$, in which case
$a = \bar a$ and $a^{-1} = \bar a^{-1}$. Therefore, if A is of type ($\beta$), exp A is a matrix C of
type (ii) or else a matrix C with exactly two equal pairs of elementary divisors
of type (i). In this last case it should be noted that a may either be positive
or negative.¹¹

If A is of type ($\gamma$), the matrix exp A has the single pair of elementary
divisors $(\lambda - a)^r$, $(\lambda - \bar a)^r$, where $a = \exp p$ and, since $p = -\bar p$, $|a| = 1$.
Further, if $|a| = 1$, and a is not real, we can always determine p to satisfy
both of the equations $a = \exp p$ and $p = -\bar p$. Since $H_1 = \rho Y_1$, $\rho = \pm 1$,
the matrices H, exp A have two distinct canonical forms. If the imaginary
part of p is an integral multiple of $\pi$, $a = \pm 1$ and exp A has the two
elementary divisors $(\lambda \pm 1)^r$, $(\lambda \pm 1)^r$. In this last case, where r is odd, exp A
is a matrix C with only two equal elementary divisors, both of type (iv).
It is important to notice that the two $\rho$'s associated with the elementary divisors
must be equal.¹²

Finally, if A is of type ($\delta$), exp A is a matrix C of type (iv), with the


11 Cf. I, page 906.
12 Cf. I, pages 907 and 908.




single elementary divisor $(\lambda - 1)^{2k+1}$. Once again there are two distinct
canonical pairs $(H_1, C_1) \sim (H, \exp A)$.

The simple matrices C, which do not have an exponential representation
are therefore,

(a) a matrix C with the single pair of real elementary divisors $(\lambda - a)^r$,
$(\lambda - a^{-1})^r$, where a is negative and distinct from minus one;

(b) a matrix C with the single elementary divisor $(\lambda + 1)^{2k+1}$;

(c) a matrix C with the pair of elementary divisors $(\lambda + 1)^{2k+1}$, $(\lambda + 1)^{2k+1}$, where
one of the associated $\rho$'s has the value plus one and the other the value
minus one.

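An illustration of (c) is the following.

If $C = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$ and $H = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, C and H are in canonical form. The
elementary divisors of C are $(\lambda + 1)$ and $(\lambda + 1)$, and the first associated $\rho$
has the value +1 and the other the value -1. Let G be the real skew
symmetric matrix $\begin{pmatrix} 0 & g \\ -g & 0 \end{pmatrix}$. Then $\exp(HG) = \exp\begin{pmatrix} 0 & g \\ g & 0 \end{pmatrix}$ and this
cannot be equal to C. On the other hand, if $H = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, so that the two
associated $\rho$'s have both the same value plus one, $C = \exp\left[\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & \pi \\ -\pi & 0 \end{pmatrix}\right]$.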
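The two halves of this illustration can be checked numerically. In the sketch below, which is an illustration only, the entry g of the skew symmetric matrix and the helper names are assumptions of the sketch, and the exponential is approximated by a truncated power series. With H = [1, -1] the leading entry of exp(HG) is cosh g, which is never less than one, so exp(HG) cannot be -E; with H = E the choice g = $\pi$ gives a rotation through $\pi$, that is, -E.

```python
import math


def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def mat_exp(a, terms=40):
    """Truncated power series for exp(a); adequate for small matrices of modest norm."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[x / m for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def exp_hg(h, g):
    """exp(HG) for the real skew symmetric matrix G with rows (0, g) and (-g, 0)."""
    return mat_exp(mat_mul(h, [[0.0, g], [-g, 0.0]]))


# H = E and g = pi: exp(HG) is a rotation through pi, that is, -E.
rotation = exp_hg([[1.0, 0.0], [0.0, 1.0]], math.pi)

# H = [1, -1]: the leading entry of exp(HG) is cosh g >= 1, so -E is never reached.
hyperbolic = exp_hg([[1.0, 0.0], [0.0, -1.0]], 2.0)
```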


In this last simple example a change in the associated $\rho$'s altered the
value of H. That this does not always happen is shown by the following
example. Let C = [1, -1, -1] and H = [-1, -1, 1]. The elementary
divisors of C are $(\lambda - 1)$, $(\lambda + 1)$, $(\lambda + 1)$. The first $\rho$ associated with
$\lambda + 1$ has the value -1 and the other the value +1. Therefore there is no
real skew symmetric matrix G, such that C = exp(HG). On the other hand,
the matrix H has the same elementary divisors as C and [-1, -1, 1]
= exp(HG), where $G = \left[\begin{pmatrix} 0 & \pi \\ -\pi & 0 \end{pmatrix}, 0\right]$.


Comparison of the canonical forms clearly shows that case (c) is a
limiting case of case (a) as a tends to minus one.

Since a matrix C, whose only elementary divisors are two equal pairs
$(\lambda - a)^r$, $(\lambda - a^{-1})^r$, always does have a real exponential representation of
the form (1), when a ≠ −1, we have the theorem:


THEOREM 5. Let C be a proper real automorph of the real non-singular
symmetric matrix H. Then the matrix C has a real exponential representation
of the form C = exp(HG), with a skew-symmetric G, if and only if, every
real elementary divisor of the form $(\lambda - a)^r$ where a is negative, occurs an
even number of times among the elementary divisors of C and, when a = −1
and r is odd, the number of positive p's associated with $(\lambda - a)^r$ is even.
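The criterion of Theorem 5 is a parity count, and it can be sketched as a small function. The encoding below is an assumption made for illustration: each real elementary divisor $(\lambda - a)^r$ of a proper automorph is given as a triple `(a, r, p)` with `p` equal to +1 or −1, complex elementary divisors are simply left out, and the name `has_exponential_representation` is hypothetical.

```python
from collections import Counter

def has_exponential_representation(divisors):
    """Parity test of Theorem 5 on a list of triples (a, r, p).

    (a, r) stands for the real elementary divisor (lambda - a)^r and p for
    the associated sign; p is only consulted when a == -1 and r is odd.
    """
    negative = [(a, r, p) for (a, r, p) in divisors if a < 0]

    # Every divisor (lambda - a)^r with a negative must occur an even number of times.
    occurrences = Counter((a, r) for (a, r, _) in negative)
    if any(count % 2 for count in occurrences.values()):
        return False

    # For a = -1 and r odd, the number of positive signs must be even.
    positives = Counter((a, r) for (a, r, p) in negative
                        if a == -1 and r % 2 == 1 and p > 0)
    return all(count % 2 == 0 for count in positives.values())

# The examples of the text, in this encoding:
minus_identity_mixed = [(-1, 1, +1), (-1, 1, -1)]       # C = -I, H = diag(1, -1)
minus_identity_definite = [(-1, 1, +1), (-1, 1, +1)]    # C = -I, H = I
three_by_three = [(1, 1, -1), (-1, 1, -1), (-1, 1, +1)] # C = [1,-1,-1], H = [-1,-1,1]
```

In this encoding the first and third examples fail the test and the second passes, in agreement with the illustrations given before the theorem.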

We next determine those matrices C which can be obtained as limiting 
cases of matrices with an exponential representation. Let C be of type (1), 
where a is negative, and r= 2k is even. Then (, is of the form 


A; 0 0 
T, = and H,= 0 


where A, = al, + U,. If By = A,— b?V*,, where is obtained from U, by 
replacing the units in the even numbered rows by zero, and D, = [B,, (B*,)"], 


then Lim B, = A, and Lim D, =T,. The matrix )), is obviously an automorph 
b—0 


of H, and, since all the latent roots of D,; are complex, D, has a representation 
of the form exp(//,G,), as long as b is different from 0. If r=2k+1, 


and F’, -( + , where ¢ is the column vector of dimension 2k defined by 


e* = (0,0,0,---,1), Lim F,—A,. The latent roots of F’ are all complex, 


except for one, which has the value a. If K,=[F,, Kr is an auto- 
morph of H,, and has elementary divisors which are all complex except for the 
two simple ones (A—a) and (A—a™"). If C has only the s pairs of ele- 


mentary divisors (A — a)", (A — a)", 1, 2,- 8, where rj = 2k, 4 1, 
then = and C, = Lim K, where K = [K,,, +, Kr,]. 
b-0 


The elementary divisors of K are all complex except for s pairs of simple
elementary divisors (λ − a), $(\lambda - a^{-1})$. If s is even, K has an exponential
representation of the form $K = \exp(H_1G_1)$. If s is odd, K does not have such
a representation and K is not the limit of a matrix which does; for otherwise,
a matrix C with the single pair of elementary divisors (λ − a), $(\lambda - a^{-1})$ would
be the limit of a matrix with an exponential representation, and this is
impossible. Consequently, if C is a matrix with only elementary divisors of the
form $(\lambda - a)^r$, $(\lambda - a^{-1})^r$, where a is negative, C is the limit of matrices of
the form exp(HG), if, and only if, the total number of pairs of elementary
divisors for which r is odd, is even.

We now consider matrices C with elementary divisors of the form $(\lambda + 1)^r$.
If C is a proper automorph of H, with only s elementary divisors of the form
$(\lambda + 1)^{2k+1}$, by the same argument that was used in the complex case, (§ 3),
C = Lim D, where all the elementary divisors of D are complex except for s
simple elementary divisors (λ + 1). With each elementary divisor of C of
the form $(\lambda + 1)^{2k+1}$, there is, in the canonical form, associated a p = ±1.
Hence with C there is associated a set of s p's. In a canonical form for D, H,




with the s elementary divisors λ + 1 of D is associated the same set of s p's.
Since C is a proper automorph of H, s is even and D has an exponential repre- 
sentation, if the number of positive p’s in the set is even. If the number of 
positive p’s is odd D does not have such a representation nor is D the limit 


of a matrix which does. For otherwise, the matrix ( 0 1, would be a 
0 expd 


. On combining these results 
expd 0 


limit of matrices of the form ( 


we have the theorem: 


THEOREM 6. Let C be a proper real automorph of the real symmetric
matrix H. Then C has a real exponential representation of the form (1)
or is the limit of real automorphs which do, if, and only if, when a is a negative
latent root of C, the total number of elementary divisors of C of the form
$(\lambda - a)^{2k+1}$ is even and, when a = −1, the total number of positive p's
associated with these elementary divisors is even.
Finally, by the same proof as that of Theorem 4, we have 


THEOREM 7. If C is a real automorph of the real symmetric matrix H
and C does not have an exponential representation of the form (1), there
exists a real automorph D of H, such that DC does have such a representation.
The automorph D is of period two.


If C is improper, D is improper and D cannot have an exponential repre- 
sentation. But, even when C is proper, it is not possible in every case to find 
a D, which does have an exponential representation. This is best shown by a 


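simple example. Let

$$C = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \quad\text{and}\quad H = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

If D is to be proper and of period two, D must be ± C. Therefore D = C and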
D does not have an exponential representation. On considering the proof of 
Theorem 4, we see that D has an exponential representation if, and only if, 
the number of positive p’s associated with C, is even. If H is definite, all p’s 
must have the same value and in this case D always has a representation of 
the form (1). 


5. Lorentzian matrices. We now consider in more detail automorphs
of the non-singular symmetric matrix H of order n and index n − 1. If C
is an automorph of H, the elementary divisors of C must all be linear. At
most one pair of real elementary divisors (λ − a), $(\lambda - a^{-1})$, where |a| ≠ 1,
can appear among the elementary divisors of C. If no such pair occurs,
C must have one elementary divisor (λ ± 1), with which is associated a
p = −1. The p's associated with all other elementary divisors of C all have
the value +1. We therefore have


THEOREM 8. Let H be a real non-singular symmetric matrix of order n
and index n − 1. If C is a real proper automorph of H, C = exp(HG) with
a skew-symmetric G, unless C has a pair of elementary divisors (λ − a),
$(\lambda - a^{-1})$, where a is negative, and, when a = −1, the associated p's have
different values.


In other words C has an exponential representation of the form (1)
unless C has a pair of elementary divisors (λ − a), $(\lambda - a^{-1})$, where a is
negative and |a| ≠ 1 or is the limit, as a tends to −1, of an automorph,
which has such a pair of elementary divisors. From Theorem 6 we deduce
the Corollary:

COROLLARY 1. No automorph, which does not have an exponential
representation, is the limit of automorphs, which do.


If n is even and (H,C) ~ (H,,C,), we may take 


The matrix C, is a diagonal block matrix with blocks of the form + 1 or 


( where a? + B?=1. The matrix C, is a diagonal matrix, 


—B a 
(4% 0 
ive 


If a ≠ −1, each elementary divisor λ + 1 of C is associated with a p,
which has the value plus one. Therefore C has an exponential representation
of the form (1), unless a is negative. When a is negative, each elementary
divisor λ + 1 of −C is associated with a p, which has the value plus one.
Accordingly −C does have an exponential representation and we have the
theorem:


THEOREM 9. If H is a real non-singular symmetric matrix of even order
n and index n − 1, and, if C is a proper automorph of H, at least one of the
matrices C or −C has an exponential representation of the form (1).


It is of course obvious that no such theorem is true if n is odd. 




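In conclusion¹² we exhibit the canonical forms $H_1$, $C_1$ and the exponential
representations of $C_1$, when n = 4 and $|C_1| = +1$. If

$$H_1 = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix},$$

$C_1$ is of a single type;

$$C_1 = \begin{pmatrix} a & 0 & 0 & 0 \\ 0 & 1/a & 0 & 0 \\ 0 & 0 & \cos\theta & \sin\theta \\ 0 & 0 & -\sin\theta & \cos\theta \end{pmatrix}.$$

The matrix $C_1 = \exp(H_1G_1)$, where

$$G_1 = \begin{pmatrix} 0 & -\log a & 0 & 0 \\ \log a & 0 & 0 & 0 \\ 0 & 0 & 0 & \theta \\ 0 & 0 & -\theta & 0 \end{pmatrix},$$

when a is positive.

As remarked earlier, if a is negative, $C_1$ does not have such an exponential
representation.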
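The closing example can be verified numerically for one choice of parameters. The sketch below assumes that $H_1$ is the matrix with the block (0 1; 1 0) followed by a 2×2 identity block, which is the form consistent with the printed $G_1$ and $C_1$; the sample values a = 2.5 and θ = 0.7, and the helpers `mat_mul`, `transpose` and `mat_exp` (a truncated power series), are illustrative assumptions.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def mat_exp(m, terms=80):
    """Matrix exponential by a truncated power series (small matrices only)."""
    n = len(m)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = mat_mul(term, m)
        term = [[x / k for x in row] for row in term]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

a, theta = 2.5, 0.7          # sample values with a positive
log_a = math.log(a)
c, s = math.cos(theta), math.sin(theta)

H1 = [[0.0, 1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0],
      [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
G1 = [[0.0, -log_a, 0.0, 0.0], [log_a, 0.0, 0.0, 0.0],
      [0.0, 0.0, 0.0, theta], [0.0, 0.0, -theta, 0.0]]
C1 = [[a, 0.0, 0.0, 0.0], [0.0, 1.0 / a, 0.0, 0.0],
      [0.0, 0.0, c, s], [0.0, 0.0, -s, c]]

# Largest entrywise deviation between exp(H1 G1) and C1.
E = mat_exp(mat_mul(H1, G1))
exp_error = max(abs(E[i][j] - C1[i][j]) for i in range(4) for j in range(4))

# C1 is an automorph of H1: transpose(C1) * H1 * C1 = H1.
A = mat_mul(mat_mul(transpose(C1), H1), C1)
automorph_error = max(abs(A[i][j] - H1[i][j]) for i in range(4) for j in range(4))
```

Both deviations are at the level of rounding error for these sample values; a negative a has no real logarithm, which is where this construction stops working.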
THE JOHNS HOPKINS UNIVERSITY. 
kor 
12 Cf. C. C. MacDuffee, The Theory of Matrices (Berlin, 1933), page 68; F. D.
Murnaghan, “On the representation of a Lorentz transformation by means of
two-rowed matrices,” American Mathematical Monthly, vol. 38 (1931), pp. 504-511.


UNBOUNDED CONVEX POINT SETS.* 


By J. J. STOKER. 


1. Introduction. We are concerned here with the properties of
unbounded convex point sets S in three-dimensional Euclidean space $E^3$, though
many of the theorems and proofs are obviously valid in $E^n$. In addition to
convexity we make the following assumptions on the sets S: a) S is not the
entire $E^3$, so that boundary points of S exist, b) S is assumed to possess inner
points in $E^3$, c) S is a closed set. Frequent use will be made of the following
well-known property of all convex sets (bounded or unbounded): there exists
on every boundary point of S at least one support plane (German: Stützebene)
of S, that is, a plane containing the boundary point and having the property
that S lies entirely in one of the two closed half-spaces bounded by the plane.¹

A special case of unbounded convex sets, the convex cone, is treated in
some detail because of its importance in the discussion of the general sets S.
By a convex cone we mean a closed convex set C consisting of infinite
half-rays all emanating from the same point O, the vertex of the cone. However,
in dealing with the cones C it is not convenient to assume that C must possess
inner points in $E^3$ or even in $E^2$, but we explicitly omit the case in which C is
the entire $E^3$. It is hence clear that the vertex O of C is always a boundary
point of C relative to $E^3$. The terms inner and boundary point are used,
however, in dealing with the cones C, in relation to the dimensionality of C,
even though C is always considered as laid in $E^3$. This terminology is free
of ambiguity since C is convex.²

With every set S there is associated a unique cone C, the characteristic
cone of S, defined as follows: Through any point $p \subset S$ all infinite half-rays
$r \subset S$ are drawn. The resulting point set is shown to be not empty and to be
closed and convex, i.e. it is a cone C. Two such cones erected on different
points $p_i \subset S$ are shown to differ only by a translation. The characteristic
cone plays a central rôle in the discussion of the sets S.

Most, though not all, of the theorems on convex cones as well as the idea 
of the characteristic cone are contained in the paper of Steinitz: “ Bedingt 


konvergente Reihen und konvexe Systeme,” Journ. f. d. reine u. angew. Math., 


* Received March 8, 1939.
¹ Bonnesen-Fenchel, Theorie der konvexen Körper (1934), p. 4. The proof given
here applies to bounded sets, but could be extended easily to unbounded sets.
² Bonnesen-Fenchel, loc. cit., p. 2.


Bd. 143 (1913) ; Bd. 144 (1914) ; Bd. 146 (1915). The theorems of Steinitz 
are formulated in a terminology which is not suited to our purposes, and since 
his proofs can be replaced by others quite concise, we shall include proofs of 
these theorems in order to preserve the continuity of the discussion. 


It is well known that the set of boundary points of a bounded convex set
with inner points in $E^3$ is homeomorphic with the surface of the sphere. The
corresponding problem for unbounded sets S is more complex, as the following 
simple examples indicate: 1) The space between and on a pair of parallel 
planes—the set of boundary points is not connected, 2) a solid right circular 
cylinder with an entire straight line as axis—the set of boundary points forms 
an open cylinder, 3) a half-space—the set of boundary points is a plane. One 
of our main purposes will be to show that these examples exhaust all
possibilities in so far as the topological structure of the boundary points of S is


concerned. 


The spherical image of a convex set is defined in the usual manner: The
outward normals on all support planes of S are displaced parallel to themselves
and erected at a point O; the points of intersection of such normals with the
unit sphere having O as center constitute a set I, the spherical image of S.
In the case of bounded convex sets, I is the entire surface of the sphere. We
prove that I of any unbounded set S lies on a closed hemisphere, that every
inner point of the spherical image $I_C$ of the characteristic cone C of S is a
point of I, and give conditions under which I of S is a closed or an open set.

At the close of this paper we prove the following theorem: There exists
a support plane T of S on a certain boundary point b of S such that the
infinite half-ray taken along the inward normal to T at b lies entirely in S if,
and only if, the set of boundary points of S is homeomorphic with the plane.
We obtain also a sufficient condition that the set of boundary points of S
may possess a representation in the form z = f(x, y) with f one-valued and


continuous. 


2. Convex cones. To the definition of the convex cone C already given
above we add the definition of the cone $C_p$ polar to C: On every support plane
of C (all of which evidently contain the vertex O of C) we erect the normal
turned away from C at O. The totality of all such normals (considered as
infinite half-rays) forms the polar cone $C_p$ of C. Since O is always a boundary
point of C (relative to $E^3$) it is clear that $C_p$ is not an empty set. It is also
easy to show that $C_p$ is closed and convex, i.e. it is also a convex cone. (See,
for example, Bonnesen-Fenchel, loc. cit., p. 4).

We begin the discussion of the cones C with 




THEOREM I. The polar cone $C_{pp}$ of the polar cone $C_p$ is identical with
the original cone³ C.


It is clear that the above definition of the polar cone $C_p$ can be given in
the following form: the necessary and sufficient condition that a half-ray $r_p$
should belong to $C_p$ is that $\angle(r, r_p) \geq \pi/2$ if r is any half-ray belonging to C.
If, then, $r \subset C$, it follows at once that $r \subset C_{pp}$, from the above definition of
the polar cone; hence we have $C \subset C_{pp}$. On the other hand, if $r \not\subset C$, it is
clear that a support plane T of C can be found which will separate r from C.
(Here we use the fact that C is a closed set.) The outward normal $r_p$ on T
at the vertex O of C belongs to $C_p$ by the definition. Hence $\angle(r, r_p) < \pi/2$,
and r can not belong to $C_{pp}$. Hence from $r \not\subset C$ follows $r \not\subset C_{pp}$ and we have
thus completed the proof that $C_{pp} = C$.
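The angle condition in this proof translates into a sign condition on scalar products, which makes Theorem I easy to test on a small example. The sketch below works in the plane with a cone generated by two half-rays; the particular cone, the hand-computed generators of its polar, and the names `in_polar` and `in_C` are assumptions chosen for illustration.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def in_polar(ray, generators, tol=1e-12):
    """True if `ray` makes an angle >= pi/2 with every generating half-ray,
    i.e. if its scalar product with each generator is <= 0."""
    return all(dot(ray, g) <= tol for g in generators)

# Cone C in the plane: all (x, y) with 0 <= y <= x, generated by two half-rays.
C_gens = [(1.0, 0.0), (1.0, 1.0)]
# Its polar cone, generated by the outward normals of the two bounding lines.
Cp_gens = [(0.0, -1.0), (-1.0, 1.0)]

def in_C(ray, tol=1e-12):
    x, y = ray
    return -tol <= y <= x + tol

# Membership in the polar of the polar, tested on 360 directions,
# should coincide with membership in C itself.
mismatches = []
for k in range(360):
    angle = math.radians(k)
    ray = (math.cos(angle), math.sin(angle))
    if in_polar(ray, Cp_gens) != in_C(ray):
        mismatches.append(k)
```

For a cone given by finitely many generators it suffices to test the angle condition against the generators, since every other half-ray of the cone is a non-negative combination of them.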

It is convenient to divide the cones C into the following classes, in which
C is: (1) an entire straight line (that is, the entire $E^1$), (2) an entire plane
(that is, the entire $E^2$), (3) any of the remaining possibilities. It is further
convenient to subdivide class (3) into three additional sub-classes, i. e., those
in which:

(a) C possesses inner points in $E^1$, but not in $E^2$. C can be only a single
infinite half-ray: if C possessed more than one such half-ray, it would either
possess inner points in $E^2$, since C is convex, or it would consist of an entire
straight line, both of which cases are to be excluded.

(b) C possesses inner points in $E^2$ but not in $E^3$. C can be only the
convex portion of the plane between and on two infinite half-rays emanating
from the same point, since C is convex and can not be the entire plane.

(c) C possesses inner points in $E^3$.

It is clear that these classes and sub-classes are mutually exclusive and exhaust
all possibilities in the $E^3$.

It is of interest to note the nature of the polar cone $C_p$ in each of the above
classes. In cases (1) and (2) this is quite simple: $C_p$ for class (1) is the
entire plane, evidently, i.e., $C_p$ is of class (2); $C_p$ for class (2) is an entire
straight line, i. e., $C_p$ is of class (1).


Lemma 1. The polar of a cone of class (3) is itself of class (3).


This follows from Theorem I: If the polar $C_p$ of a cone C of class (3)
were of class (1), say, then $C_{pp}$ would be of class (2) as we have just seen.
But $C = C_{pp}$, which shows that our assumption is absurd. It is clear, also,


³ See Steinitz, loc. cit., vol. 144, p. 10.


that $C_p$ could not be of class (2), by the same argument. Since $C_p$ is not
empty, it must be of class (3).

From now on, in this section, we shall consider cones of class (3) only. 
Evidently this class is distinguished from the other two by the following 
property: The vertex O of C is a boundary point of the cones of class 3), but 
is an inner point in classes 1) and 2), according to our definition of the terms 
inner and boundary points when applied to cones C. We prove now a theorem 


on cones of class (3): 


THEOREM II. There exists a support plane T of C with the following
property: the inward normal (that is, the normal turned toward C) on T at
O lies in C.


Since C and $C_p$ are both of class (3), by Lemma 1, the point O is a
boundary point of both sets. It is obvious from the definition of $C_p$ that C
and $C_p$ have no points in common except O. It follows at once that C and $C_p$
possess a common support plane T through⁴ O. The outward normal $r_p$ (relative
to C) on T at O belongs to $C_p$, the inward normal r on T at O to $C_{pp}$, hence
$r \subset C$ since $C = C_{pp}$.

We can present this theorem in a sharper form, as follows:


THEOREM III. There exists a support plane T of C with the following
property: the inward normal on T at O contains (with the exception of O)
only inner points of C.


We know from the preceding theorem that there exists a support plane T
and an inward normal r on it at O, such that $r \subset C$. If r contained an inner
point of C, then our theorem would be proved, clearly. If r contained no
inner point, it would lie in the boundary of C. There would hence exist a
support plane $T_1$, containing r, which would be perpendicular to T. Through
O we take a third plane $T_2$ perpendicular to both T and $T_1$. If $T_2$ contains
an inner point p of C, the ray Op lies in the interior of C and, in addition,
a plane $T_3$ through O normal to Op would clearly be a support plane of C:
the ray Op would hence possess the required property. If $T_2$ contained no
inner point of C, then $T_2$ would necessarily be a support plane of C, and C
would lie in the convex portion of space between three planes mutually at right
angles. In this case it is clear that every ray in the interior of C would have
the required property.


3. The characteristic cone. Since any set S is unbounded, there exists
an unbounded sequence of points $p_\nu$ in S. Consider any point


⁴ Bonnesen-Fenchel, loc. cit., p. 4. The same remark applies here as in footnote 1.


$p \subset S$. S being closed and convex, all line segments $pp_\nu$ as well as all limit
points of such segments belong to S. The half-rays $pp_\nu$ cut the unit sphere
with p as center in points $q_\nu$ which possess a limit point q. The half-ray pq
is made up entirely of limit points of the segments $pp_\nu$. We have then


Lemma 2. Every point $p \subset S$ contains at least one infinite half-ray
$r \subset S$.


A half-ray with this property we denote from now on as an axial half-ray.


Lemma 3. If r is an axial half-ray on point $p \subset S$ and $\bar{p}$ any other point
of S, the half-ray $\bar{r}$ on $\bar{p}$ parallel to r is also an axial half-ray of S.

For if $p_1, p_2, \cdots, p_\nu$ is an unbounded sequence of points of r, the line segments
$\bar{p}p_\nu$ lie in S and all points of $\bar{r}$ are limit points of these segments. Hence
$\bar{r} \subset S$.


Lemma 4. The set C of all axial half-rays emanating from a point $p \subset S$
is a convex cone, the characteristic cone of S.

That C is convex is shown as follows: Let $p_1$ and $p_2$ be any two points of C.
We must show that the segment $p_1p_2 \subset C$. This is evidently the case if the
half-rays $r_1 = pp_1$ and $r_2 = pp_2$ are identical or opposite in direction, or if
either $p_1$ or $p_2$ is identical with p. In any other case the plane convex sector
defined by $r_1$ and $r_2$ belongs to S since S is convex. All half-rays in this
sector which emanate from p belong to C, and with them the segment $p_1p_2$.


From Lemmas 2 to 4 we conclude: To every point $p \subset S$ there exists a
characteristic cone with p as vertex, and all such cones go into one another by
translations.
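The independence of the characteristic cone from its vertex can be illustrated on a polyhedral set, for which the axial directions are easy to describe. The sketch below assumes a hypothetical example set S in the plane, given by the inequalities x ≥ 0, y ≥ 0, y ≥ x − 1; the test of a half-ray is only a sampled one (finitely many points along the ray), and all names are illustrative.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

# S = {x : A x <= b}, i.e. x >= 0, y >= 0, y >= x - 1 (hypothetical example set).
A = [(-1.0, 0.0), (0.0, -1.0), (1.0, -1.0)]
b = [0.0, 0.0, 1.0]

def in_set(x, tol=1e-9):
    return all(dot(row, x) <= bi + tol for row, bi in zip(A, b))

def ray_stays_in_set(p, d, steps=(0.0, 1.0, 10.0, 1e3, 1e6)):
    """Sampled test that the half-ray p + t*d, t >= 0, lies in S."""
    return all(in_set((p[0] + t * d[0], p[1] + t * d[1])) for t in steps)

def in_characteristic_cone(d, tol=1e-9):
    """For a polyhedral S the axial directions are those with A d <= 0."""
    return all(dot(row, d) <= tol for row in A)

# Directions at multiples of 15 degrees, tested from two different points of S.
directions = [(math.cos(math.radians(k)), math.sin(math.radians(k)))
              for k in range(0, 360, 15)]
base_points = [(0.0, 0.0), (5.0, 7.0)]
axial = {p: [d for d in directions if ray_stays_in_set(p, d)]
         for p in base_points}
cone_directions = [d for d in directions if in_characteristic_cone(d)]
```

From both base points the same four sampled directions (45°, 60°, 75°, 90°) turn out to be axial, and they are exactly those satisfying A d ≤ 0, as Lemma 3 leads one to expect.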


4. Topological structure of the boundary points of S. Consider any
inner point $p \subset S$ and an infinite half-ray r, going out from p, which contains
at least one boundary point of S. On going out from p along r one must come
upon a first such boundary point, say b, since the set of boundary points on r
is closed and also bounded on the side toward p. Any support plane P of S
on b cuts the ray r at b as it would otherwise contain p, an inner point of S,
and this is manifestly impossible. The line segment pb is thus the set of points
common to S and r, all of them being inner points except the unique boundary
point b. We conclude in


Lemma 5. A half-ray drawn from an inner point of S contains a unique 
boundary point of S, or it lies entirely in S and therefore belongs to the char- 
acteristic cone of S. 
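For a polyhedral set the alternative of Lemma 5 can be computed directly: along a half-ray from an inner point, either some bounding inequality is eventually violated, which gives the unique boundary point, or none is, and the half-ray is axial. The sketch below uses a hypothetical example set (x ≥ 0, y ≥ 0, y ≥ x − 1 in the plane); the function name `first_boundary_point` is illustrative.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

# S = {x : A x <= b}, i.e. x >= 0, y >= 0, y >= x - 1 (hypothetical example set).
A = [(-1.0, 0.0), (0.0, -1.0), (1.0, -1.0)]
b = [0.0, 0.0, 1.0]

def first_boundary_point(p, d, tol=1e-12):
    """Boundary point of S on the half-ray p + t*d (t > 0) from an inner point p,
    or None when the half-ray lies entirely in S (an axial half-ray)."""
    t_hit = None
    for row, bi in zip(A, b):
        rate = dot(row, d)               # growth of row.x along the ray
        if rate > tol:                   # this inequality is eventually violated
            t = (bi - dot(row, p)) / rate
            if t_hit is None or t < t_hit:
                t_hit = t
    if t_hit is None:
        return None
    return tuple(pi + t_hit * di for pi, di in zip(p, d))

p = (1.0, 1.0)                                   # an inner point of S
hit_right = first_boundary_point(p, (1.0, 0.0))  # meets the line y = x - 1
hit_left = first_boundary_point(p, (-1.0, 0.0))  # meets the line x = 0
upward = first_boundary_point(p, (0.0, 1.0))     # axial direction, no boundary point
```

Taking the smallest admissible parameter t corresponds to the "first such boundary point" of the argument preceding the lemma.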




Let p be an inner point of S, K the unit sphere with center at p, and C
the characteristic cone of S with vertex at p. The points of intersection of C
with the surface of K we denote by $\bar{C}$, the remaining points on the sphere by
$\bar{B}$, i.e. $\bar{B}$ is the complement of $\bar{C}$ relative to the surface of K. From Lemma 5
we conclude: $\bar{B}$ is the set of points formed by the intersection of the surface
of K with infinite half-rays drawn from p to the set B of boundary points of S
and the correspondence thus set up is one to one. Since the characteristic cone
C for any S is independent of the point $p \subset S$ chosen to define it, the set $\bar{C}$,
and with it $\bar{B}$, is uniquely defined by S. C being a closed set and not the
entire $E^3$, it follows that $\bar{C}$ is not the entire surface of K and is closed and
that $\bar{B}$ is a non-empty open set.

The sets B and $\bar{B}$ are homeomorphic, that is, the correspondence set up
between B and $\bar{B}$ by central projection is not only one to one, as we have seen,
but also continuous in both directions. We show the continuity first in the
direction $\bar{B} \to B$. Consider any point $\bar{b} \subset \bar{B}$ and let $b \subset B$ be the point
corresponding to $\bar{b}$. We have to show that to any set $\bar{b}_i \subset \bar{B}$ and having $\bar{b}$ as
limit point corresponds a set $b_i \subset B$ with b as limit point. To prove this
it is sufficient to show that the set $b_i$ possesses a limit point b* on the half-ray
pb, for, assuming the existence of b*, it is clear that $b^* \subset B$ and Lemma 5
shows that b and b* are identical. The existence of b* is readily shown: The
boundary point b contains a support plane P of S on one side of which $b_i$ as
well as p lie; we can construct a finite cone with vertex on p and base on P which
will be bounded and contain if not $b_i$ itself at least an infinite sub-sequence $b'_i$
of $b_i$, from which, if necessary, a further sub-sequence can clearly be taken
which will converge to a limit point b* on pb. The existence of b* is thus
assured and with it the continuity in the direction $\bar{B} \to B$. The continuity in
the direction $B \to \bar{B}$ can be shown in a similar manner; in fact, this is simpler,
since it is evident a priori that every infinite set $\bar{b}_i \subset \bar{B}$ possesses a limit point.

To sum up, we have seen that $\bar{B}$, the complement of $\bar{C}$ relative to the
surface of K is uniquely determined by S, and is homeomorphic with the set
of boundary points B of S. The problem of determining the possible
topological structure of the boundary points of S is thus resolved into the following
problem: Determine the possible topological structures of the open sets on the
surface of the unit sphere K which are obtained by removing from the surface
of K its intersection with any convex cone with vertex at the center of K.

Before continuing with the solution of this problem, it is of interest to
note that Theorem I, Lemmas 2 to 5, and all that we have shown in this
section with the proofs, as given, are valid for the $E^n$, independent of n.




We consider the classification of cones C given in section 2 above. These
classes were those in which C is: (1) an entire straight line, (2) a plane,
(3) all other cases. In case (1), $\bar{C}$ is made up of two diametrically opposite
points of the sphere and $\bar{B}$ is homeomorphic with an open cylinder. In the
second case $\bar{C}$ is a great circle on the sphere and $\bar{B}$ is homeomorphic with two
distinct planes.

In the third case it is convenient to consider a further division of the
cones into sub-classes in which C possesses: (a) inner points in $E^1$ but not
in $E^2$, (b) inner points in $E^2$ but not in $E^3$, (c) inner points in $E^3$. In
section 2 we saw that C in case (3a) is a single half-ray; hence $\bar{C}$ is a single
point and its complement $\bar{B}$ is homeomorphic with the plane. We saw also
that C in case (3b) is a convex sector of the plane; $\bar{C}$ is a closed segment of a
great circle and again $\bar{B}$ is homeomorphic with the plane.

In the case (3c) it is clear that $\bar{C}$ possesses inner points relative to the
surface of the unit sphere. We shall show that $\bar{C}$ is convex on the sphere, that
is, $\bar{C}$ lies on a hemisphere and with any two of its points also contains a great
circle arc of length ≤ π joining the two points. An immediate consequence
of this is that $\bar{C}$ is simply connected, and, since it is also a closed set, its
complement $\bar{B}$ relative to the surface of the sphere would be homeomorphic
with an open circle, hence also with the plane, which is what we wish to show.

We have, then, to show that $\bar{C}$ in the case under consideration is
spherically convex. This follows from the convexity of C. The vertex of C (and
center of the sphere) is a boundary point of C, a support plane of C at this
point exists, and, since $\bar{C} \subset C$, it follows that $\bar{C}$ lies on a hemisphere.
Consider any two points $p_1$ and $p_2$ which belong to $\bar{C}$ and which are not at opposite
ends of a diameter of the sphere (that such points exist is clear since $\bar{C}$
possesses inner points on the sphere). The entire convex plane sector formed by
rays from the center of the sphere to $p_1$ and $p_2$ belongs to C; the intersection of
the sector with the sphere belongs to $\bar{C}$: it is thus clear that the shorter great
circle arc joining $p_1$ and $p_2$ belongs to $\bar{C}$. If $\bar{C}$ contained no diametrically
opposite points, the spherical convexity would be proved. If $\bar{C}$ should contain
a pair of diametrically opposite points $p_1$ and $p_2$, it would also contain a point
$p_3$ different from these. The great circle arcs joining $p_3$ with $p_1$ and $p_2$ would,
as above, belong to $\bar{C}$. Hence $\bar{C}$ is spherically convex.

With this we have also determined completely the possible topological
structures of the boundary points B of sets S in $E^3$. Summing up, we have

THEOREM IV. The set of boundary points of a set S in $E^3$ possesses one
of the three following topological forms: (1) an open cylinder, (2) two
distinct planes, (3) a single plane.




It is of interest to characterize in more detail the sets S in cases (1) and
(2) of the above theorem:

Case (1). We have seen that the characteristic cone in this case is a
single straight line L. Consider the set of points P common to S and a plane
perpendicular to L. Since such a plane contains no axial half-ray, it follows
that P is a bounded set. It is also, of course, closed and convex. By Lemma 3
there exists through every point of P one and only one straight line (parallel
to L) which lies in S. S is thus an infinite cylinder.

Case (2). The characteristic cone is a plane. The set S can clearly be
generated by moving a plane parallel to itself through a finite distance. S is
therefore the space between two parallel planes.


We can now state an evident corollary to Theorem IV: If the set S in $E^3$
is not the space between two parallel planes, the set of its boundary points is a
connected set. This corollary is true for sets S in the $E^n$, and although we
confine ourselves here in general to sets in $E^3$, it is of interest to give a proof
of this fact⁵ valid in the $E^n$. This is readily done with the aid of the following:


Lemma 6. Consider a closed unbounded convex set S with inner points,
all of whose boundary points lie on a pair of parallel planes $T_1$ and $T_2$, with
at least one boundary point on each plane. Then S is the space between the
parallel planes.

From the Lemmas 2 to 5 (valid for any dimension) we conclude: the
characteristic cone of S is a plane parallel to $T_1$ and $T_2$ and all points of $T_1$ and $T_2$
are boundary points of S. This proves the lemma.


THEOREM V. Let S be any closed unbounded convex set with inner
points which is not the space between parallel planes nor the entire space.
Any two boundary points of S can be connected by a continuous plane curve
lying in the boundary of S.

Let $b_1$ and $b_2$ be any two boundary points of S, $T_1$ and $T_2$ support planes
on these points. We may clearly assume without loss of generality that $T_1$
and $T_2$ are different. Two cases are to be distinguished: (a) $T_1$ and $T_2$ are
not parallel, (b) $T_1$ and $T_2$ are parallel.

(a). $T_1$ and $T_2$ intersect. S lies in the convex portion of space bounded
by $T_1$ and $T_2$ which contains $b_1$ and $b_2$. Consider a two-dimensional plane
containing $b_1$ and $b_2$ and cutting the intersection of $T_1$ and $T_2$ in point a. We are
free to assume that $b_1b_2$ contains an inner point $p \subset S$, since otherwise $b_1b_2$

5 This result, but not our Theorem IV, is due to Steinitz, loc. cit., vol. 146, p. 10. 




itself belongs to the boundary of S. From p lines are drawn to all points of
the segments $b_1a$ and $ab_2$. Every such line clearly contains a single boundary
point of S, and we see without difficulty that $b_1$ and $b_2$ are connected by a
continuous plane curve lying in the boundary of S. (See the proof for
homeomorphism of the sets B and $\bar{B}$ in section 4).


(b). $T_1$ and $T_2$ are parallel. Lemma 6 and our fundamental assumption
insure the existence of a boundary point b which does not lie on $T_1$ or $T_2$ and
which must lie between $T_1$ and $T_2$, S being convex. Any support plane of S
on b must clearly intersect $T_1$ and $T_2$. The proof that $b_1$ and $b_2$ are connected
by a plane curve lying in the boundary of S may now be conducted in exactly
the same way as in (a).

5. The spherical image of S. Consider any boundary point b of S,
together with the characteristic cone C and its polar cone $C_p$ erected at this
point. Any support plane T of S on b must be a support plane of C also:
it follows at once from the definition of $C_p$ given in section 2 that the outward
normal n on T must lie in $C_p$. We erect C and $C_p$ with their common vertex O
at the center of the unit sphere and denote by $\bar{C}_p$ the intersection of $C_p$ with
the surface of the sphere. From the definition of the spherical image I of S
given in section 1 and from the foregoing we conclude: $I \subset \bar{C}_p$.

We proceed to investigate the relations between I and $\bar{C}_p$ in more detail.
Since $\bar{C}_p$ is the spherical image of C, we are in effect investigating the relations
between the spherical image of S and that of its characteristic cone. It is
convenient to introduce at this point the same classification of convex cones
discussed in section 2 and apply it here to the polar cone $C_p$ of the characteristic
cone of S:


(1) C, is an entire straight line, C is a plane, and S the space between 
two parallel planes. In this case J and Cy are evidently identical. 


(2) $C_p$ is a plane, C is an entire straight line perpendicular to it, S an
infinite cylinder erected on a bounded convex plane set and again $I = C_p$ (each
is a great circle on the unit sphere). We use here the known result that the
"circular image" of a bounded convex set in the plane is the circle: circular
image of the plane section of S and spherical image of S are identical.

(3) All other cases, subdivided as follows: 

(a) $C_p$ possesses inner points in $E^1$ but not in $E^2$. We know that $C_p$
consists of a single infinite half-ray. The cone C is thus a half-space and one
can easily show that S must also be a half-space. It is then evident that
I and $C_p$ are identical.




174 J. J. STOKER.


(b) $C_p$ possesses inner points in $E^2$ but not in $E^3$. $C_p$ is the convex
space between two half-rays. The cone C is a wedge: that is, the convex space
between and on two half-planes having a common boundary. (The two planes
are at right angles to the boundary half-rays of $C_p$). The set S is easily seen
to be an infinite cylinder with generators parallel to the edge of the wedge and
having as base an unbounded convex set in the plane, in contrast with case (2)
of Theorem IV in which the cylinder is erected on a bounded set. The set
$C_p$ is a closed segment of a great circle (at most a semi-circle) and the spherical
image of S is clearly the same as the circular image of the plane, unbounded,
convex set upon which the cylinder is erected. It is convenient to defer
further consideration of this case until after discussion of the next case, which
is exactly analogous for three dimensions.


(c) $C_p$ possesses inner points in $E^3$. $C_p$ possesses inner points
relative to the surface of the sphere. We shall show that every inner point of $C_p$
is a point of the spherical image of S. Let s be an inner point of $C_p$, P the
plane through the common vertex O of C and $C_p$ which is perpendicular to the
line joining O with s. P is (1) a support plane of the cone C which (2)
contains no point of C except its vertex O: (1) P is a support plane of C by the
definition of $C_p$, and (2) P contains no point of C except O, since it would
otherwise contain an entire half-ray $r \subset C$ which would clearly make an angle
$< \pi/2$ with some of the half-rays of $C_p$ which pass through the points of a
neighborhood of the inner point $s \subset C_p$, in contradiction with the definition
of $C_p$. Consider next any inner point $p \subset S$ and the characteristic cone C of
S erected on p as vertex. The set of points common to S and that one of the
two half-spaces bounded by a plane T parallel to P and not containing
we denote by S'. S' is evidently convex; it is moreover bounded, since
every infinite half-ray emanating from p and lying in the half-space
containing S' contains a boundary point of S because of the choice of T.
The set S' possesses inner points in $E^3$ (since p is an inner point of S) and
T is a support plane of this set. By a well known theorem on bounded convex
sets there exists a second support plane T' of S' parallel to T. T' is clearly
also a support plane of S. The outward normal on T' is parallel to Os (the
direction of the outward normal on T relative to C). Hence point s belongs
to the spherical image of S, as was to be shown.

We can now consider case (b). The method of proof used for case (c)
is not valid here without change, since no support plane of C exists which
contains only the vertex of C because of the fact that C contains an entire
straight line in its boundary (the "edge" of the wedge) and every support
plane of C must contain this line. However, as remarked above, the spherical




image of S is the same as the circular image of the unbounded convex plane
set which consists of the intersection of S with a plane perpendicular to the
edge of the cone C. We might call this set $S^2$. A discussion exactly analogous
to that of the above but carried out one dimension deeper would show that
every inner point of $C_p$ would be a point of the circular image of $S^2$. (The
term inner point means, of course, inner point relative to the great circle which
contains $C_p$). We sum up these results in


THEOREM VI. The spherical image I of S is contained in the spherical
image $C_p$ of its characteristic cone. If $C_p$ is one-dimensional or two-dimensional,
at least every inner point of $C_p$ (this term being used with reference to the
dimensionality of $C_p$) is a point of I. In all other cases I is identical with $C_p$.


The cone $C_p$ possesses always a support plane through its vertex. From
this we deduce an evident corollary to the above theorem: I of S lies on a
hemisphere.

At this point a natural question arises: Under what conditions will I be
an open set (that is, contain none of the boundary points of $C_p$), or a closed
set? The cases in which this question remains open are the cases (3b) and
(3c) above, I being identical with $C_p$ in all the others. Additional restrictive
assumptions must be made on the sets S in order to answer this question
definitely: Consider the convex set which is bounded by a paraboloid of
revolution: the spherical image of this set is evidently an open hemisphere; on the
other hand, a semi-infinite right circular cylinder possesses a closed hemisphere
as spherical image, though the cone C and with it $C_p$ are the same for both
sets: a single half-ray and a half-space. An example of a set for which I is
neither open nor closed could easily be given.
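The paraboloid example can be checked numerically. The following is a minimal sketch, assuming the concrete set $z \ge x^2 + y^2$ (our choice of coordinates, not the author's); the function name is hypothetical. It computes outward unit normals at boundary points and shows that the vertical component stays strictly negative while tending to zero, so the normals fill the open lower hemisphere without ever reaching its equator.

```python
import math

def outward_normal_paraboloid(x, y):
    """Unit outward normal of S = {z >= x^2 + y^2} at the boundary point (x, y, x^2 + y^2)."""
    # S is the sublevel set {g <= 0} of g(x, y, z) = x^2 + y^2 - z,
    # so the gradient of g points out of S.
    gx, gy, gz = 2.0 * x, 2.0 * y, -1.0
    norm = math.sqrt(gx * gx + gy * gy + gz * gz)
    return (gx / norm, gy / norm, gz / norm)

# Move outward along the boundary and record the vertical component of the normal.
radii = [0.0, 1.0, 10.0, 1000.0]
z_components = [outward_normal_paraboloid(r, 0.0)[2] for r in radii]
```

For the semi-infinite cylinder, by contrast, the lateral surface already carries horizontal normals, which is why its spherical image contains the equator and is closed.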

We begin our discussion of this problem with the case (3b), which reduces
to the consideration of an unbounded convex set $S^2$ in the plane whose
characteristic cone is the convex sector between two half-rays. If $C^2$ of $S^2$ is in
particular a half-plane, $S^2$ is also a half-plane, as one readily shows. In this
case the circular image I of $S^2$ and the circular image $C_p$ of $C^2$ are evidently
the same: a single point on the unit circle. If $C^2$ is not a half-plane, which
we assume from now on, then $C_p$ clearly possesses inner points in $E^2$, all of
which as we have seen belong to I. We wish to obtain conditions on $S^2$ which
determine whether the two boundary points of $C_p$ belong to I or not. That
$C_p$ has only two boundary points is sufficiently clear. It is of importance to
note explicitly that the boundary points of $C_p$ are determined by the two
boundary rays of $C_p$, and that the latter are the two half-rays at right angles to
the boundary rays of C.

Suppose that a boundary point $b \subset C_p$ belongs to I. As remarked above,
the half-ray Ob is perpendicular to an axial ray $r \subset C$; at the same time Ob
is parallel to the outward normal on a certain support line L of $S^2$. L contains
a half-ray (parallel to r) by Lemma 3, all points of which must be boundary
points of $S^2$ since L is a support line. We conclude: If $S^2$ contains no infinite
half-ray in its boundary, I contains no boundary point of $C_p$, i.e., I is an
open set.
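A small numerical illustration of this conclusion, under assumptions of our own: we take the planar set $y \ge \sqrt{1 + x^2}$ (whose boundary contains no half-ray) and compare it with the wedge $y \ge |x|$ (whose boundary consists of two half-rays). Both have the sector $y \ge |x|$ as characteristic cone, so the boundary points of $C_p$ are the directions at angles $-3\pi/4$ and $-\pi/4$. The helper names are hypothetical.

```python
import math

def normal_angle(slope):
    """Angle of the outward normal (slope, -1) of the set {y >= g(x)} where g'(x) = slope."""
    return math.atan2(-1.0, slope)

def hyperbola_slope(x):
    # g(x) = sqrt(1 + x^2): the boundary contains no half-ray.
    return x / math.sqrt(1.0 + x * x)

def wedge_slope(x):
    # g(x) = |x| for x != 0: the boundary is made of two half-rays.
    return 1.0 if x > 0 else -1.0

# Angles of the two boundary points of C_p on the unit circle.
lo, hi = -3 * math.pi / 4, -math.pi / 4

xs = [-1e6, -10.0, 0.0, 10.0, 1e6]
hyperbola_angles = [normal_angle(hyperbola_slope(x)) for x in xs]
wedge_angles = [normal_angle(wedge_slope(x)) for x in (-1.0, 1.0)]
```

The hyperbola's normal directions approach the two end directions without attaining them (an open arc), whereas the wedge attains both, in line with the closed case treated next.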

On the other hand, suppose that there exists a circle with radius sufficiently
large that all boundary points of $S^2$ outside of the circle lie on infinite half-rays
belonging to the boundary of $S^2$. We may choose a circle with this property
with its center at an inner point $p \subset S^2$. On p as vertex we erect the cone C.
The boundary rays of C intersect the circle say in points $q_1$ and $q_2$. The angle
between $pq_1$ and $pq_2$ is definitely $< \pi$, since C is not a half-plane. The
outward normal n to C (regarded as a half-ray) erected at $q_1$ must contain a
boundary point b of $S^2$ (since $n \not\subset C$), which must, in addition, lie outside
the circle, n being a tangent to the circle. By our assumption, there exists a
certain half-ray $r_1$ starting from b which lies entirely in the boundary of $S^2$.
It follows that $r_1$ lies on a support line of $S^2$, and, by Lemma 3, $r_1$ must be
parallel to a ray in the boundary of C; if it were in the interior of C all points
of $r_1$ except b would be inner points of the characteristic cone C erected on b
as vertex and hence also inner points of $S^2$, which is not possible. Moreover,
$r_1$ must be parallel to $pq_1$; if it were parallel to $pq_2$ (the only other possibility)
it would necessarily intersect $pq_1$, since the angle between n and $r_1$ would be
$> \pi/2$, evidently. Again we see that $r_1$ would contain inner points of C and
consequently also of $S^2$, which is impossible. Hence n is perpendicular to $r_1$.
It follows at once that the point of I corresponding to n is one of the two
boundary points of $C_p$. In the same way, by considering an outward normal
to C at point $q_2$, one shows that the other boundary point of $C_p$ also belongs
to I. I is therefore closed.

We pass to consideration of case (3c), that in which $C_p$ possesses inner
points relative to the surface of the unit sphere. Theorem III insures the
existence of a support plane T of $C_p$ with the following property: the inward
normal $n_1$ on T at the vertex O of $C_p$ contains (with the exception of O) only
inner points of $C_p$. Since the polar cone $C_{pp}$ of $C_p$ is identical with C (Theorem
II), it follows that the outward normal $n_2$ on T at O lies in C. Because of
the fact that $n_1$ lies in the interior of $C_p$, the plane T contains no point of C
except O. Let p be the point of intersection of the inward normal on T with




the surface of the unit sphere: p is thus an inner point of $C_p$ and hence a point
of I by Theorem VI. Consider any boundary point of $C_p$, say q. The points
p and q determine a great circle on the sphere, a closed and connected segment
of which belongs to $C_p$, since $C_p$ is convex. One boundary point of this
segment is q while p lies in its interior.

We consider the set $S_p$ consisting of the orthogonal projection of all
points of S on the plane P determined by the points O, p, and q. (As $C_p$ lies
on a hemisphere with p as an inner point, it is clear that p and q do not lie
on a diameter of the sphere and O, p, and q determine a unique plane). $S_p$ is
unbounded, since P contains an axial half-ray of S, i.e. the half-ray in the
direction $n_2$. $S_p$ clearly possesses inner points in the plane, but is not the
entire plane, as S possesses a support plane parallel to T (p being a point of I),
which is by construction perpendicular to P. As we have seen, the plane T
contains no ray of the characteristic cone C of S; consequently the intersection
of S with any plane $T_1$ parallel to T is a closed bounded set. Since the $T_1$
are perpendicular to the plane of $S_p$, we conclude that the correspondence
established between S and $S_p$ is such that to each boundary point of $S_p$
corresponds at least one boundary point of S. S being unbounded, it is not obvious
that $S_p$, the projection of S on a plane, is a closed set even though S is closed.
This we show as follows: Let p be a limit point of a set $p_\nu \subset S_p$. We must
prove that p is the projection of some point of S on the plane P. Consider a
circle with p as center which contains an infinite number of points $p'_\nu$ of the
points $p_\nu$. As we have seen, the intersection of S with any plane $T_1$
perpendicular to P is a bounded set. It follows that the intersection of S with the
right cylinder erected over the circle with p as center is also bounded; a set of
points in S corresponding to the $p'_\nu \subset S_p$ must then possess a limit point $p_s$ on
the projection ray through p, and $p_s$ belongs to S since S is closed. This
establishes the fact that $S_p$ is closed. The projection $S_p$ of S on the plane P is
thus a closed unbounded convex set such that to each boundary point of $S_p$
corresponds at least one boundary point of S. Also, support lines of $S_p$
correspond to support planes of S perpendicular to the plane of $S_p$ and vice versa.

We may now apply to $S_p$ the reasoning used above for the sets $S^2$. The
assumption that q, a boundary point of $C_p$, belongs to I of S and consequently
also to the circular image $I_p$ of $S_p$ insures the existence of an infinite half-ray
in the boundary of $S_p$, as we have seen, and hence the existence of an unbounded
set of boundary points of S all lying in a certain support plane $T_2$ of S
perpendicular to the plane of $S_p$. The intersection $S_2^2$ of $T_2$ with S being convex,
it follows that $S_2^2$ contains an axial half-ray. Since q is any boundary point
of I of S we may conclude: I of S is an open set if S possesses no half-ray
in its boundary.




On the other hand, assume now that a sphere exists such that every
support plane of S on any boundary point outside the sphere contains at least
one half-ray in common with S. Let $b_p$ be any boundary point of $S_p$ which
corresponds to a boundary point b of S outside such a sphere. We show that
$b_p$ lies on an infinite half-ray $r_p$ in the boundary of $S_p$: There exists a support
plane $T_2$ of S on b perpendicular to the plane of $S_p$. $T_2$ contains an infinite
half-ray r belonging to the boundary of S by our assumption, and r can not
be perpendicular to the plane P containing $S_p$ since, as we have seen, the
characteristic cone C of S is such that none of its rays are at right angles to P.
It follows that the projection of r on P is an infinite half-ray which evidently
lies in the boundary of $S_p$. The spherical image of $S_p$ is thus a closed set, as
we have seen above, and we conclude that the point q in the boundary of $C_p$
belongs to I of S. The point q being any boundary point of $C_p$, it follows that
all boundary points of $C_p$ belong to I and I is closed. We thus have


THEOREM VII. If I of S is of dimension two, it is (a) an open set if
S contains no infinite half-ray in its boundary, (b) a closed set if every support
plane of S outside a certain sphere lies on an infinite half-ray belonging to
the boundary of S.


If I of S is of dimension one, the above theorem does not hold without
modification, as the following example shows: Consider the set S consisting
of the convex portion of the plane bounded by a parabola together with all
straight lines through such points at right angles to the plane. The spherical
image I of S is evidently an open semi-circle, though the condition in (a) of
Theorem VII is violated and that of (b) is fulfilled. However, as we have seen
in dealing with the sets $S^2$ above, Theorem VII holds in this case also if it is
applied, with obvious changes in the terminology, not to S itself but to the
plane section of S which is normal to the plane containing I. We have
already seen that I of S in this case is identical with the circular image of
such a section.

6. Additional results. We begin this section with a theorem which
gives a characteristic property of the sets S whose boundary points are
homeomorphic with the plane:

THEOREM VIII. If the set of boundary points of S is homeomorphic
with the plane, there will exist a support plane T on a certain boundary point
b of S with the following property: the infinite half-ray taken along the inward
normal to T at b (that is, the normal turned toward S) lies entirely in S.


It is clear that this property is not shared with the sets S whose boundary 




points are homeomorphic with a cylinder or a pair of planes, since such sets are 
geometrically right cylinders or the space between parallel planes respectively. 

Any set S whose boundary points are homeomorphic with the plane
possesses a characteristic cone of class (3), as we have seen in the course of
proving Theorem IV. It follows (Lemma 1, section 2) that the polar cone $C_p$
of C is also of this class. We may therefore apply Theorem III of section 2
to $C_p$. This theorem insures the existence of a support plane $T_1$ of $C_p$ such
that the inward normal n on $T_1$ at the vertex of $C_p$ contains, with the exception
of the vertex, only inner points of $C_p$. The outward normal on $T_1$ at the
vertex lies in C, by Theorem I and the definition of the polar cone. Since n
contains an inner point p of $C_p$, it follows that $T_1$ is parallel to a support plane
T of S, since $p \subset I$ of S by Theorem VI. The inward normal on T at the
boundary point through which it passes is thus parallel to a ray belonging to
C and consequently belongs to S, which proves the theorem.

For certain sets S it is possible to find a plane P with the following
property: the orthogonal projection of the set B of boundary points of S on
P is such that a one to one correspondence between the two sets of points is
established. It is clear that only the sets S whose boundary points are
homeomorphic with the plane can possess this property. In fact, only certain sets of
this type possess it, as the example of an infinite half-cylinder shows. We have,
however, the following theorem:


THEOREM IX. If the characteristic cone C of S possesses inner points in
$E^3$, a plane P can always be chosen to serve as x, y-plane of a set of orthogonal
cartesian coordinates such that the points of the set B will be given by
$z = f(x, y)$ with f one-valued and continuous.


Consider any half-ray r which lies in the interior of C. We show that a
plane P at right angles to r has the required property. By Lemma 3, there
exists an infinite half-ray parallel to r through every point $b \subset B$ which lies
entirely in S. This half-ray contains with the exception of b only inner
points of S, since r lies in the interior of C. It is moreover clear that b is the
only boundary point of S lying on the straight line containing r. This is
sufficient to show that the projection of the points of B on P is one to one. One
shows also without difficulty that f is continuous: for example, one might use
practically the same method as that used at the beginning of section 4. One
sees also that the domain of definition of f is the entire plane.


NEW YORK UNIVERSITY.




ON THE SMOOTHNESS PROPERTIES OF A FAMILY OF 
BERNOULLI CONVOLUTIONS.* 


By PAUL ERDÖS.


Let $L(u, \sigma)$, $-\infty < u < +\infty$, denote the Fourier-Stieltjes transform,

$\int_{-\infty}^{\infty} e^{iux}\, d\sigma(x)$,

of a distribution function $\sigma$. Thus if
$\beta(x)$ is the distribution function which is $0, \tfrac{1}{2}, 1$ according as $x \le -1$,
$-1 < x \le 1$, $1 < x$, then $L(u, \beta) = \cos u$; and so, if b is a positive
constant, $\cos(u/b)$ is the transform of the distribution function $\beta(bx)$. Hence,
if a is a positive constant, the infinite convolution

$\sigma_a(x) = \beta(ax) * \beta(a^2 x) * \beta(a^3 x) * \cdots$

is convergent if and only if a > 1; its Fourier-Stieltjes transform being

(1)    $L(u, \sigma_a) = \prod_{n=1}^{\infty} \cos(u/a^n)$,    $(a > 1)$.
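As a quick sanity check on (1), the product can be evaluated numerically by truncation. The sketch below is only illustrative: the function name and the default number of factors are our own choices. For a = 2 the infinite product is the classical one, $\prod_{n \ge 1} \cos(u/2^n) = \sin u / u$, which gives an independent value to compare against.

```python
import math

def bernoulli_transform(u, a, terms=200):
    """Truncated product prod_{n=1}^{terms} cos(u / a**n), approximating L(u, sigma_a)."""
    if a <= 1:
        raise ValueError("the infinite convolution converges only for a > 1")
    prod = 1.0
    for n in range(1, terms + 1):
        prod *= math.cos(u / a ** n)
    return prod

# For a = 2 the product collapses to sin(u) / u (sigma_2 is the uniform law on [-1, 1]).
value_at_3 = bernoulli_transform(3.0, 2.0)
```

The truncation is accurate only when `a ** terms` is much larger than `|u|`; for a close to 1 more factors are needed.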


It is known¹ that the distribution function $\sigma_a$ is continuous for every
a > 1 and, in fact, is either absolutely continuous or purely singular,
depending on the value of a. In this direction it is known² that the set of points x
in the neighborhood of which $\sigma_a(x)$ is not constant is either the interval
$|x| \le 1/(a-1)$ or a nowhere dense perfect set of measure zero contained in
this interval according as $1 < a \le 2$ or $2 < a$. While this implies that $\sigma_a(x)$
is singular if $2 < a$ it does not imply that $\sigma_a(x)$ is absolutely continuous if
$a < 2$. In fact it has recently³ been shown that there exist certain algebraic
irrationalities $a < 2$ for which $L(u, \sigma_a)$ does not tend to zero with 1/u and
so $\sigma_a$ cannot be absolutely continuous. (It was conjectured, loc. cit.³, that such
values of a are clustering at a = 1 + 0 which would imply that they lie dense
in the interval $1 < a < 2$). On the other hand it is known⁴ that those $a < 2$


* Received July 30, 1939.

¹ B. Jessen and A. Wintner, "Distribution functions and the Riemann zeta
function," Transactions of the American Mathematical Society, vol. 38 (1935), 48-88,
particularly Theorem 11.

² R. Kershner and A. Wintner, "On symmetric Bernoulli convolutions," American
Journal of Mathematics, vol. 57 (1935), 541-548.

³ P. Erdös, "On a family of symmetric Bernoulli convolutions," American Journal
of Mathematics, vol. 61 (1939), 974-976.

⁴ A. Wintner, "On convergent Poisson convolutions," American Journal of
Mathematics, vol. 57 (1935), 827-838.




for which $\sigma_a$ is absolutely continuous are certainly clustering at a = 1 + 0,
since if $a = 2^{1/m}$, where m is a positive integer, then $\sigma_a$ has a continuous
derivative of order m − 1.
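The statement about $a = 2^{1/m}$ can be seen directly from (1). The following short derivation is a sketch added for orientation (it is not taken from the cited paper); it uses only the classical identity $\prod_{k \ge 1} \cos(v/2^k) = \sin v / v$ and the fact that every $n \ge 1$ can be written uniquely as $n = mk - j$ with $k \ge 1$ and $0 \le j \le m-1$.

```latex
\begin{aligned}
L(u,\sigma_a)
  &= \prod_{n=1}^{\infty} \cos\!\bigl(u\,2^{-n/m}\bigr)
   = \prod_{j=0}^{m-1}\;\prod_{k=1}^{\infty} \cos\!\Bigl(\frac{2^{j/m}u}{2^{k}}\Bigr)
   = \prod_{j=0}^{m-1} \frac{\sin\bigl(2^{j/m}u\bigr)}{2^{j/m}u},
   \qquad a = 2^{1/m},\\[4pt]
|L(u,\sigma_a)|
  &\le \prod_{j=0}^{m-1} \frac{1}{2^{j/m}|u|}
   = 2^{-(m-1)/2}\,|u|^{-m}.
\end{aligned}
```

Thus $L(u, \sigma_a) = O(|u|^{-m})$ in this case, so that $|u|^{m-2}\,|L(u,\sigma_a)|$ is integrable; by Fourier inversion the density of $\sigma_a$ then has $m-2$ continuous derivatives, which is the assertion that $\sigma_a$ has a continuous derivative of order $m-1$.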

The object of the present paper is to show that the successive smoothing
of $\sigma_a$ can be considered as the general case when $a \to 1 + 0$. In fact it will
be shown that there exists, for every positive integer m, a positive $\eta(m)$ such
that the set of those points a of the interval $1 < a < 1 + \eta(m)$ for which $\sigma_a$
does not possess a continuous derivative of order m − 1 is a set of measure
zero. To this end it is sufficient to prove that there exists, for every positive
integer m, a positive $\delta(m)$ such that the set of those points a of the interval
$1 < a < 1 + \delta(m)$ for which

(2)    $L(u, \sigma_a) = O(|u|^{-m})$,    $u \to \infty$,

does not hold is a set of measure zero.


Let $c_1, c_2, \cdots, c_N$ be N positive integers which satisfy the following
conditions:
(i) 
(ii) 
(iii) Bey, (t—1,2,---,N—1); 
(iv) there exists an a such that $2^{1/2} < a < 2$ and $|c_{i+1} - a c_i| < 2$,
(i = 1, 2, ···, N − 1).


LEMMA 1. There exist two positive absolute constants $\gamma_1, \gamma_2$ such that
if M is any fixed number $> \gamma_2$, there are less than $[M^{1/4}]$ different sequences
$c_1, \cdots, c_N$ satisfying the requirements (i)-(iv), the inequality $c_N \le M$,
and the condition that the number of those indices i (i = 1, 2, ···, N) which
satisfy $|c_{i+1} - a c_i| > 1/10$ is less than $\gamma_1 \log M$.


Proof. Suppose that $|c_{i+1} - a c_i| \le 1/10$ and $|c_{i+2} - a c_{i+1}| \le 1/10$ for
a fixed i. Then
] 


Ci 


hence 
| ci 10 


by (ili). Consequently, since | | < by assumption, 


| Ci | 10 10 2 
and so $c_{i+2}$ is uniquely determined as the nearest integer⁵ to $c_{i+1}^2/c_i$.

⁵ The above considerations are suggested by the investigations of Ch. Pisot, "La
répartition modulo un et les nombres algébriques," Annali d. R. Sc. Norm. Sup. di Pisa,
ser. II, vol. VII, p. 238.
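The mechanism of this step (two consecutive small errors force the next term) can be illustrated on a concrete sequence. The sketch below is our own example, not part of the paper: it takes a to be the golden ratio, which lies between $2^{1/2}$ and 2, and $c_i$ the Lucas numbers, and it reads the garbled tolerance in the text as 1/10, which is an assumption. The function names are hypothetical.

```python
def nearest_integer_ratio(c_prev, c_cur):
    """Nearest integer to c_cur**2 / c_prev, computed in exact integer arithmetic."""
    # floor(x + 1/2) with x = c_cur^2 / c_prev
    return (2 * c_cur * c_cur + c_prev) // (2 * c_prev)

def lucas(n_terms):
    """First n_terms Lucas numbers 2, 1, 3, 4, 7, 11, ..."""
    seq = [2, 1]
    while len(seq) < n_terms:
        seq.append(seq[-1] + seq[-2])
    return seq[:n_terms]

PHI = (1 + 5 ** 0.5) / 2          # the ratio a of this example
L = lucas(30)

# Indices i at which both |c_{i+1} - a c_i| and |c_{i+2} - a c_{i+1}| are at most 1/10
# (the tolerance 1/10 is an assumed reading of the source).
good = [i for i in range(len(L) - 2)
        if abs(L[i + 1] - PHI * L[i]) <= 0.1
        and abs(L[i + 2] - PHI * L[i + 1]) <= 0.1]

# At every such index the next term is forced by the two preceding ones.
predicted_ok = all(nearest_integer_ratio(L[i], L[i + 1]) == L[i + 2] for i in good)
```

Here the small-error condition first holds at index 7, and from there on each Lucas number is recovered from its two predecessors by the nearest-integer rule.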




Consequently if $i_1, i_2, \cdots, i_l$ denote all those among the indices i which
satisfy the inequality $|c_{i+1} - a c_i| > 1/10$ then all indices i which are not of
the form $i_r + 1$ or $i_r + 2$ for some r = 1, 2, ···, l, are such that $c_i$ is uniquely
determined by $c_{i-1}$ and $c_{i-2}$. On the other hand, even if i is of the form $i_r + 1$
or $i_r + 2$, so that $c_i$ is not uniquely determined by $c_{i-1}$ and $c_{i-2}$, then there
are, by (iv), (or (i)), at most 4 choices for $c_i$ after $c_{i-1}$ has been determined.
Hence there are at most $4^{2l}$ different sequences $c_1, c_2, \cdots, c_N$ which have a
given set of exceptional indices $i_1, i_2, \cdots, i_l$.

Finally (ii) and (iv) together with the assumption $c_N \le M$ clearly imply
that $N < 5 \log M$ for sufficiently large M, say for $M > \gamma_2$. Since the number
of exceptional indices $i_1, i_2, \cdots, i_l$ is less than $\gamma_1 \log M$, by the hypothesis of
Lemma 1, it is seen that the number of distinct possible choices for a set of
exceptional indices cannot exceed

$\binom{[5 \log M]}{0} + \binom{[5 \log M]}{1} + \cdots + \binom{[5 \log M]}{[\gamma_1 \log M]}$

and is therefore less than $M^{1/8}$ if $\gamma_1$ is chosen sufficiently small. Since it was
shown above that there are at most $4^{2l}$ sequences $c_1, c_2, \cdots, c_N$ with a given
set of exceptional indices, it follows that the number of distinct sequences
$c_1, c_2, \cdots, c_N$ which satisfy the requirements of Lemma 1 for a fixed $M > \gamma_2$
is less than

$4^{2l} M^{1/8} < 4^{2\gamma_1 \log M} M^{1/8} < M^{1/4}$

if $\gamma_1$ is sufficiently small. This completes the proof of Lemma 1.
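The counting step rests on the fact that a sum of binomial coefficients $\binom{[5\log M]}{j}$ over $j \le \gamma_1 \log M$ is a small power of M once $\gamma_1$ is small. The sketch below makes this concrete; it is illustrative only, the function name is hypothetical, and the sample values of M and $\gamma_1$ are our own (the paper does not give numerical constants).

```python
import math

def exceptional_set_exponent(M, gamma1):
    """Return e such that sum_{j <= gamma1*log M} C(floor(5 log M), j) equals M**e."""
    n = math.floor(5 * math.log(M))          # upper bound for the length N
    t = math.floor(gamma1 * math.log(M))     # maximal number of exceptional indices
    count = sum(math.comb(n, j) for j in range(t + 1))
    return math.log(count) / math.log(M)

M = 10.0 ** 100
# The exponent shrinks as gamma1 decreases.
exponents = [exceptional_set_exponent(M, g) for g in (0.5, 0.1, 0.01)]
```

With these sample values the exponent drops from above 1 to well below 1/8, which is the kind of bound the proof needs before multiplying by the factor $4^{2l}$.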

If a, λ are positive numbers let $A_k = A_k(a, \lambda)$ and $\epsilon_k = \epsilon_k(a, \lambda)$ be defined,
for k = 1, 2, ···, by placing
(3)    $\lambda a^k = A_k + \epsilon_k$,    $A_k$ integer,

LEMMA 2. There exists an absolute constant $\gamma_3$, which shall be chosen
to be $> \gamma_2$, such that if M has a fixed value greater than $\gamma_3$, then the measure
of the set T of those values a in the interval

(4)

for which there exists in the interval

(5)    $1 < \lambda < 2$

a $\lambda = \lambda(a)$ such that the inequalities

(6.1)    $\lambda a^k < M$;    (6.2)    $|\epsilon_k(a, \lambda)| > 1/30$

hold for at most $\tfrac{1}{2}\gamma_1 \log M$ distinct values of k, is less than $M^{-1/2}$. It is
understood that $\epsilon_k = \epsilon_k(a, \lambda)$ is defined as in (3), and that $\gamma_1, \gamma_2$ are the absolute
constants occurring in Lemma 1.
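The quantities in (3), (6.1) and (6.2) are easy to compute for given a, λ and M. The sketch below is a floating-point illustration under assumptions of our own: $A_k$ is taken to be the nearest integer to $\lambda a^k$, the tolerance in (6.2) is passed as a parameter (its value is hard to read in the source), and the function names are hypothetical. The sample call uses the golden ratio for both a and λ, a case in which $\lambda a^k$ approaches integers rapidly and so very few k satisfy (6.2).

```python
import math

def epsilon(a, lam, k):
    """Return (A_k, eps_k) with lam * a**k = A_k + eps_k and A_k the nearest integer."""
    x = lam * a ** k
    A = round(x)
    return A, x - A

def count_far_from_integers(a, lam, M, threshold):
    """Number of k >= 1 with lam * a**k < M and |eps_k| > threshold (requires a > 1)."""
    count, k = 0, 1
    while lam * a ** k < M:
        if abs(epsilon(a, lam, k)[1]) > threshold:
            count += 1
        k += 1
    return count

PHI = (1 + math.sqrt(5)) / 2
# Only the first few powers of the golden ratio stay away from the integers.
few = count_far_from_integers(PHI, PHI, 1e9, 1 / 30)
```

Lemma 2 says that, for large M, values of a admitting some λ with this few admissible k form a set of small measure; the golden ratio is one such exceptional value.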


Proof. Suppose, if possible, that Lemma 2 is false. Then there exist at
least $[M^{1/4}]$ values of a in (4), say

$a_j$,    $(j = 1, 2, \cdots, [M^{1/4}])$,

which are in T and which are separated by $[M^{1/4}] - 1$ intervals each of which
has a length not less than $M^{-3/4}$; so that
(7)    $|a_j - a_k| \ge M^{-3/4}$,    $(j \ne k)$.
Since $a_j$ is in T, there exists a $\lambda = \lambda(a_j)$ in (5) such that

$|\epsilon_k(a_j, \lambda(a_j))| \le 1/30$

holds for all but $\tfrac{1}{2}\gamma_1 \log M$ values of k satisfying

$a_j^k \lambda(a_j) < M$,

where, according to (3)
(8)    $a_j^k \lambda(a_j) = A_k(a_j, \lambda(a_j)) + \epsilon_k(a_j, \lambda(a_j)) = A_k^{(j)} + \epsilon_k^{(j)}$, say.
It will be shown that

(I) The finite sequence of integers $A_i^{(j)}$ belonging to a fixed j
$(= 1, 2, \cdots, [M^{1/4}])$ satisfies the hypotheses of Lemma 1 if this sequence
of integers is identified with the sequence of integers $c_1, c_2, \cdots, c_N$ occurring
there; and that

(II) The sequences $A_i^{(j)}$ corresponding to different values of j are
distinct. Since there are $[M^{1/4}]$ such sequences this will contradict Lemma 1
and so complete the proof of Lemma 2.

In order to prove (I) notice first that (i), (ii), (iii) are obviously
satisfied for $c_i = A_i^{(j)}$. Furthermore, by (8)

$A_{i+1}^{(j)} + \epsilon_{i+1}^{(j)} = a_j (A_i^{(j)} + \epsilon_i^{(j)})$

and so, by (3) and (4)

$|A_{i+1}^{(j)} - a_j A_i^{(j)}| = |a_j \epsilon_i^{(j)} - \epsilon_{i+1}^{(j)}| < 2$

so that (iv) is also satisfied, with $a = a_j$. The hypothesis (6.1) assures that
the assumption $c_N \le M$ of Lemma 1 is satisfied. In order to verify the
remaining assumption of Lemma 1 recall that there are at most $\tfrac{1}{2}\gamma_1 \log M$
values of k satisfying (6.1), (6.2). Thus there are at most $\gamma_1 \log M$ values
of i such that (6.1), (6.2) are satisfied either for k = i or for k = i + 1.
But if i has a value distinct from one of these $\gamma_1 \log M$ values, so that

$|\epsilon_i^{(j)}| \le 1/30$ and $|\epsilon_{i+1}^{(j)}| \le 1/30$,

then, by (4),

$|A_{i+1}^{(j)} - a_j A_i^{(j)}| = |a_j \epsilon_i^{(j)} - \epsilon_{i+1}^{(j)}| \le 1/10$.

Thus there are at most $\gamma_1 \log M$ indices i for which

$|A_{i+1}^{(j)} - a_j A_i^{(j)}| > 1/10$.

This completes the proof of (I).
In order to prove (II), suppose, if possible, that (II) is false. Then
there exists a pair of distinct indices j and k such that
$A_l^{(j)} = A_l^{(k)}$
for all l = 1, 2, ···, N. Thus, by (3),
(9) | (a) — a;'d(a;)| 2 
holds for all l such that $a_k^l \lambda(a_k) \le M$. In particular (9) holds if l is an
index for which


1 1 
(10) 4 M ye! > 


Now it may be assumed that $a_k > a_j$ so that, by (7), $a_k \ge a_j + M^{-3/4}$. Then
and so, by (9), 
Zz — 2) (aj + M**) = aj"A(a;) 
+ aj"A(a;) M-*/4 — 2(a; + M-*"*), 
Hence, by (5) and (10), 
*A (xe) = + —2— 2 (a; + M**) a;"A(a;) +3 


if M is sufficiently large, say $M > \gamma_3$. Thus
| (ay) | 3. 


This contradicts (9) (since by (10) $a_k^{l+1} \lambda(a_k) < M$) where one could write
l + 1 for l. This contradiction proves (II).
The proof of Lemma 2 is now complete.


LEMMA 3. There exists, on the interval (4) a zero set Z which has the
following property: if a is a point of (4) not contained in Z then there is a
positive $\beta = \beta(a)$ such that if M is any fixed number larger than β and if λ
is any number in (5), then there are at least $\tfrac{1}{2}\gamma_1 \log M$ values of k which
satisfy both conditions (6.1), (6.2).

Proof. For any positive integer h let $T_h$ denote the set of points a on
the interval (4) such that (6.1), (6.2) hold (for some $\lambda = \lambda(a)$ in (5))
for less than $\tfrac{1}{2}\gamma_1 \log M$ values of k if $M = 2^h$. Then, by Lemma 2,

meas $T_h < 2^{-h/2}$ if $2^h > \gamma_3$.




Thus if $T_\mu$ denotes for any fixed $\mu > \gamma_3$ the a-set
(11)    $\sum_{2^h > \mu} T_h$,    then meas $T_\mu < 4\mu^{-1/2}$.

It is clear from the definition of $T_\mu$ that if a is not in $T_\mu$ and if $M > \mu$, then,
even if M is not of the form $2^h$ for some h, there are still at least $\tfrac{1}{2}\gamma_1 \log M$
values of k satisfying (6.1), (6.2) for any value of λ in (5). Thus if a is
not in $T_\mu$ then there is a $\beta = \beta(a)$ satisfying the requirements of Lemma 3;
in fact one can choose $\beta = \mu$. Then the set of points a in (4) such that there
does not exist a $\beta = \beta(a)$ satisfying the requirements of Lemma 3 is
contained in $T_\mu$ for every positive μ. Thus by (11), Z is a zero set. This
completes the proof of Lemma 3.


LEMMA 4. For every q > 0 there exists a $\rho = \rho(q) > 1$ and a zero set
$Z = Z_q$ of a-values contained in the interval

(12)    $1 < a < \rho(q)$

with the following properties: if a is a point of (12) not contained in $Z_q$
then there exists an $\alpha = \alpha(a) > 0$ such that if M is any fixed number greater
than α, and if λ is any point of the interval (5), then there are at least $q \log M$
values of k satisfying (6.1), (6.2).

Proof. Let a be a point in the interval $1 < a < 2^{1/2}$ such that no integral
power of a is a point of the zero set Z occurring in Lemma 3. Let $p_1, p_2, \cdots, p_r$
be those prime numbers such that

Now if x is such that $a^x = 2$ then, by the elementary inequalities of Chebyshev,
there are two absolute constants $\gamma_4, \gamma_5$ such that


Since $a^{p_j}$ (j = 1, 2, ···, r) is in the interval (4) and not a point of Z, there
are, by Lemma 3, for every λ in (5), at least $\tfrac{1}{2}\gamma_1 \log M$ values of k satisfying

(14.1)    $\lambda a^{p_j k} < M$,    (14.2)    $|\epsilon_k(a^{p_j}, \lambda)| > 1/30$

provided $M > \beta(a^{p_j})$. Thus, if $M > \max_{1 \le j \le r} \beta(a^{p_j})$ there are at least
$\tfrac{1}{2}\gamma_1 \log M$ values of k satisfying (14.1), (14.2) for each j (= 1, 2, ···, r).


But there are at most $\frac{x \log M}{p_i p_j \log 2}$ values of k such that

Thus there are at least

$\tfrac{1}{2} r \gamma_1 \log M - \sum_{1 \le i < j \le r} \frac{x \log M}{p_i p_j \log 2}$

values of k satisfying (6.1) and (6.2). Then by (13) the number of values
k which satisfy (6.1) and (6.2) is not less than


417; —— ; log M — log M. 


But this expression can be made greater than $q \log M$ if x is chosen
sufficiently large, i.e., if a is chosen sufficiently small, say $a < \rho(q)$. This
completes the proof of Lemma 4 since $Z_q$ may be defined to be the zero set of
points a in the interval (12), some integral power of which is a point of Z.
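The proof gains its factor r by applying Lemma 3 to the prime powers $a^{p_j}$ that fall in the interval (4), and Chebyshev's estimates guarantee many such primes when a is close to 1. The sketch below lists those primes for a given a. It is illustrative only: the endpoints of (4) are illegible in this copy, so the defaults $2^{1/2}$ and 2 are an assumption exposed as parameters, and the function names are hypothetical.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n (n >= 1)."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for multiple in range(p * p, n + 1, p):
                sieve[multiple] = False
    return [p for p, is_prime in enumerate(sieve) if is_prime]

def admissible_primes(a, lo=2 ** 0.5, hi=2.0):
    """Primes p with lo < a**p < hi, for 1 < a < hi (lo, hi are assumed endpoints of (4))."""
    bound = math.ceil(math.log(hi) / math.log(a))   # a**p >= hi beyond this point
    return [p for p in primes_up_to(bound) if lo < a ** p < hi]

# With a = 2**(1/100), i.e. x = 100, these are the primes between 50 and 100.
sample = admissible_primes(2 ** 0.01)
```

For x = 100 there are ten such primes; as x grows their number grows like a constant times x / log x, which is what makes the final count exceed q log M.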


THEOREM. For every positive integer m, there exists a positive $\delta = \delta(m)$
such that the set of points a of the interval $1 < a < 1 + \delta(m)$ for which

$L(u, \sigma_a) = O(|u|^{-m})$,

does not hold is a set of measure zero.


Proof. According to (1)

$L(u, \sigma_a) = \prod_{n=1}^{\infty} \cos(u/a^n)$,    $(a > 1)$.
Thus, if w is in the interval << uS 
k 
L(u,oa) < TI cos (a"(u/a*)). 
r=1 
Now let $\lambda = u/a^k$ so that $1 < \lambda < 2$. Then


k 
L(u,oa) < IL | cos (Aa’) | | cos (Aa’)|. 
r=1 as 


By Lemma 4, with M = u, if a is chosen in the interval (12) and not in $Z_q$
and if $u > \alpha(a)$ there are at least $q \log u$ factors in this last product which
are less than $\cos(\pi/30)$ so that

$|L(u, \sigma_a)| < (\cos(\pi/30))^{q \log u}$,    $u > \alpha(a)$.

Since, according to Lemma 4, q (> 0) can be chosen arbitrarily this completes
the proof of the theorem.
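The last step can be spelled out as follows. This is a sketch in which c denotes any constant with $0 < c < 1$ that bounds the absolute value of at least $q \log u$ of the factors (the remaining factors being at most 1 in absolute value); the symbol c is introduced here only for the illustration.

```latex
|L(u,\sigma_a)| \;\le\; c^{\,q\log u}
  \;=\; e^{\,q \log u \,\log c}
  \;=\; u^{-q\log(1/c)},
\qquad u > \alpha(a),
```

so that the choice $q \ge m/\log(1/c)$ gives $L(u,\sigma_a) = O(|u|^{-m})$ for every a of the interval (12) outside the zero set $Z_q$, with $\delta(m) = \rho(q) - 1$.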


INSTITUTE FOR ADVANCED STUDY. 




ALGEBRAIC VARIETIES OVER GROUND FIELDS OF 
CHARACTERISTIC ZERO.* 


By OSCAR ZARISKI.


Introduction. In an earlier paper (see footnote **) we have derived a
number of characteristic properties of simple points of an algebraic
r-dimensional variety $V_r$. There the ground field K (field of coefficients, or field of
constants) was assumed throughout to be algebraically closed. In the present
paper we generalize our results to any $V_r$ defined by a field Σ of algebraic
functions over an arbitrary ground field K of characteristic zero. We do not
assume that K is maximally algebraic in Σ.

Our generalization has an immediate application to simple subvarieties
of $V_r$, of any dimension. This application is given in the last part (V) of
the paper. An irreducible s-dimensional subvariety $V_s$ of $V_r$ can be treated
as a point P, provided we pass to a new ground field K (a suitable
transcendental extension of K in Σ) and regard our $V_r$ as an (r − s)-dimensional
variety $V_{r-s}$ over K. From our definitions it will follow that $V_s$ is simple
for $V_r$ if and only if P is simple for $V_{r-s}$. The properties of the simple point
P yield corresponding properties of the simple $V_s$. It is this application
that should justify (in the eyes of a geometer) our consideration of ground
fields which are not algebraically closed.


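Let $\xi_1, \cdots, \xi_n$ be the coordinates of the general point of $V_r$ and let
$\mathfrak{o}$ denote the ring $K[\xi_1, \cdots, \xi_n]$. An irreducible $V_s$ on $V_r$ is given by a
prime s-dimensional ideal $\mathfrak{p}$ in $\mathfrak{o}$. Let $\mathfrak{J}$ be the quotient ring of $V_s$
($\mathfrak{J} = \mathfrak{o}_\mathfrak{p}$; $a/b \in \mathfrak{J}$ if $a, b \in \mathfrak{o}$, $b \not\equiv 0\,(\mathfrak{p})$) and let $\mathfrak{m}$ be the prime ideal
of non units of $\mathfrak{J}$. We define a simple $V_s$ by the condition that there exist
r − s elements $\eta_1, \cdots, \eta_{r-s}$ in $\mathfrak{J}$ such that $\mathfrak{J} \cdot (\eta_1, \cdots, \eta_{r-s}) = \mathfrak{m}$. The elements
$\eta_i$ are referred to as uniformizing parameters along $V_s$, or of $V_s$. Our main result
concerns the characterization of a simple $V_s$ and of its uniformizing parameters
with the aid of the different $F'_\omega$ of primitive elements ω in $\mathfrak{o}$. In this
characterization we start with an arbitrary set of r elements $\zeta_1, \cdots, \zeta_r$ in $\mathfrak{o}$ such that $\mathfrak{o}$ is
integrally dependent on $K[\zeta_1, \cdots, \zeta_r]$. Let $F'_\omega$ be the different of an element ω in $\mathfrak{o}$
if $\zeta_1, \cdots, \zeta_r$ are taken as the independent variables. Just as a matter of
arrangement of the indices it is permissible to assume that $\zeta_1, \cdots, \zeta_s$ are
algebraically independent mod $\mathfrak{p}$. Let $f_i(\zeta_1, \cdots, \zeta_s; \zeta_{s+i}) \equiv 0$ (mod $\mathfrak{p}$) be
the irreducible congruence which $\zeta_{s+i}$ satisfies over $K(\zeta_1, \cdots, \zeta_s)$
(i = 1, 2, ···, r − s). We show that if there exists an element ω in $\mathfrak{o}$ such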

* Received September 28, 1939. 


that $F'_\omega \not\equiv 0(\mathfrak{p})$, then $V_s$ is simple and the $r-s$ elements
$f_i(\zeta_1, \ldots, \zeta_s; \zeta_{s+i})$ are uniformizing parameters of $V_s$; and conversely.

An almost immediate consequence of this result is that the quotient ring
$\mathfrak{J}$ of a simple $V_s$ is integrally closed in $\Sigma$.

The burden of the proofs rests naturally on the case of simple points.
We consider the residue class field $K_{\mathfrak{p}}$ of a point P, i.e. the field $K_{\mathfrak{p}} = \mathfrak{o}/\mathfrak{p}$.
This field is a finite algebraic extension of K. Let $K^*$ be the least normal
extension of K which contains $K_{\mathfrak{p}}$. Upon extending the ground field K to $K^*$,
a new variety $V^*_r$ is obtained, and on $V^*_r$ the point P splits into a finite
number of points $P^*_1, \ldots, P^*_h$. The most difficult step of the theory is the
proof that P is simple for $V_r$ if and only if the points $P^*_i$ are simple for $V^*_r$
and if the quotient ring $\mathfrak{J}$ of P contains the relative algebraic closure of K in
$\Sigma$. With the aid of this result, the various theorems concerning the simple
point P can be readily deduced from the corresponding theorems concerning
the points $P^*_i$.

This reduction succeeds because at each point $P^*_i$ we have a very special
state of affairs, namely the residue class field at each point $P^*_i$ coincides with
the new ground field $K^*$. This is therefore a special case of our problem:
it is characterized by the condition $K_{\mathfrak{p}} = K$. This special case is treated first
(Part III). Here we pass directly from K to the algebraically closed field
determined by K. It is shown that this ground field extension does not cause
any splitting of the point P. We then use the results already established in
the case of an algebraically closed ground field.

The method just outlined necessitates a preliminary study of the splitting
of prime ideals in $\mathfrak{o}$ under algebraic extensions of the ground field (Part I).
We could not take over directly the results established in this connection by
van der Waerden and Krull, because these authors have only dealt with the
special case in which K is maximally algebraic in $\Sigma$.

The systematic study of simple points and of simple subvarieties under- 
taken in this paper is a necessary preliminary to the problem of local 
uniformization on algebraic varieties which we shall treat in a forthcoming 
paper. 

I. Normal ground field extensions. 

1. Let $\Sigma$ be a field and let K be a subfield of $\Sigma$, of characteristic zero.
The field K shall be referred to as the ground field. We consider a normal
algebraic extension field $K^*$ of K and we wish to show how this extension of
the ground field defines a corresponding extension field of $\Sigma$, which we shall
denote by $\Sigma^*$, or by $K^*\Sigma$.

Let $\Omega$ be the algebraically closed field determined by K and let $K'$ be the




relative algebraic closure of K in $\Sigma$, i.e. the field consisting of all those
elements of $\Sigma$ which are algebraic over K. The fields $K'$ and $K^*$ can be
imbedded in $\Omega$. This imbedding is defined to within relative automorphisms
of $K'$ and $K^*$ over K, but since $K^*$ is a normal extension of K, the intersection
of $K'$ and $K^*$ is a subfield of $K'$ which is independent of the imbedding. Let
this subfield be denoted by A.

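The elements of $\Sigma^*$ shall be the formal finite sums $\xi^* = a^*_1\xi_1 + \cdots + a^*_h\xi_h$,
$a^*_i \subset K^*$, $\xi_i \subset \Sigma$, h arbitrary. Addition, subtraction and multiplication are
defined formally in an obvious fashion. We need a rule for identifying two
formal sums, and for this it is sufficient to give a rule for identifying a formal
sum with the zero element of $\Sigma^*$. Let $\xi^* = a^*_1\xi_1 + \cdots + a^*_h\xi_h$ and
let $b^*_1, \ldots, b^*_n$ ($b^*_j \subset K^*$) be an independent A-basis of the algebraic
extension field $A(a^*_1, \ldots, a^*_h)$ of A. If we substitute formally into the sum
$a^*_1\xi_1 + \cdots + a^*_h\xi_h$ the expressions of $a^*_1, \ldots, a^*_h$ in terms of $b^*_1, \ldots, b^*_n$
(linear forms with coefficients in A), we get an expression of the form
$b^*_1\eta_1 + \cdots + b^*_n\eta_n$, $\eta_j \subset \Sigma$. To indicate this substitution we write:
$a^*_1\xi_1 + \cdots + a^*_h\xi_h \to b^*_1\eta_1 + \cdots + b^*_n\eta_n$. We identify the element $\xi^*$ with
the zero element of $\Sigma^*$, if and only if $\eta_1 = \cdots = \eta_n = 0$. It is self-evident
that this identification rule is independent of the choice of the base
$b^*_1, \ldots, b^*_n$. More generally, let $c^*_1, \ldots, c^*_m$ be a set of elements of $K^*$
which are such that: (1) they are linearly independent over A; (2) the $a^*_i$ can
be expressed as linear forms of the $c^*_j$ with coefficients in A. The elements
$c^*_j$ need not belong to the field $A(a^*_1, \ldots, a^*_h)$. By condition (2) we get,
through formal substitution: $\xi^* \to c^*_1\zeta_1 + \cdots + c^*_m\zeta_m$. We assert that
$\xi^* = 0$ if and only if $\zeta_1 = \cdots = \zeta_m = 0$.

For the proof, let $d^*_1, \ldots, d^*_\nu$ be an independent A-basis of
$A(b^*_1, \ldots, b^*_n, c^*_1, \ldots, c^*_m)$ and let $\xi^* \to d^*_1\omega_1 + \cdots + d^*_\nu\omega_\nu$.
It is clear that $b^*_1\eta_1 + \cdots + b^*_n\eta_n \to d^*_1\omega_1 + \cdots + d^*_\nu\omega_\nu$ and also
$c^*_1\zeta_1 + \cdots + c^*_m\zeta_m \to d^*_1\omega_1 + \cdots + d^*_\nu\omega_\nu$. If $b^*_j = \sum_{i=1}^{\nu} k_{ji} d^*_i$,
then the matrix $(k_{ji})$ is of rank n, since $b^*_1, \ldots, b^*_n$ are linearly independent
over A, and moreover $\omega_i = \sum_{j=1}^{n} k_{ji}\eta_j$. Similarly, if $c^*_j = \sum_{i=1}^{\nu} l_{ji} d^*_i$, then the
matrix $(l_{ji})$ is of rank m, and we have $\omega_i = \sum_{j=1}^{m} l_{ji}\zeta_j$. Hence, if $\xi^* = 0$, i.e.
$\eta_1 = \cdots = \eta_n = 0$, then $\omega_1 = \cdots = \omega_\nu = 0$, and since $(l_{ji})$ is of rank m,
it follows that $\zeta_1 = \cdots = \zeta_m = 0$. Conversely, if $\zeta_1 = \cdots = \zeta_m = 0$, then
$\omega_1 = \cdots = \omega_\nu = 0$, i.e. $\sum_{j=1}^{n} k_{ji}\eta_j = 0$, $i = 1, 2, \ldots, \nu$, and since the matrix
$(k_{ji})$ is of rank n, it follows that $\eta_1 = \cdots = \eta_n = 0$, i.e. $\xi^* = 0$.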


¹ We use small Greek letters for elements of $\Sigma$ and small Latin letters for elements
of K. The same letters with an asterisk denote elements of $\Sigma^*$ and $K^*$ respectively.




As an immediate consequence we have the following: if $a^*_1, \ldots, a^*_h$ are
themselves linearly independent over A, then $a^*_1\xi_1 + \cdots + a^*_h\xi_h = 0$ if and
only if $\xi_1 = \cdots = \xi_h = 0$.


It is clear that formal addition and multiplication of the elements of $\Sigma^*$,
considered as formal sums, is consistent with our identification rule. Hence
$\Sigma^*$ is a ring.
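
The identification rule can be followed on a small instance. The sketch below is only an illustration: the fields $K = \mathbb{Q}$, $K^* = \mathbb{Q}(\sqrt{2})$ and $\Sigma = \mathbb{Q}(t)$ are our own assumed choices and do not occur in the text. With them $K' = \mathbb{Q}$, so that $A = \mathbb{Q}$.

```latex
% Assumed data (not from the text): K = Q, K^* = Q(sqrt 2), Sigma = Q(t),
% hence K' = Q and A = K' \cap K^* = Q.
\[
  \xi^* \;=\; 1\cdot\xi_1 \;+\; \sqrt{2}\cdot\xi_2 \;+\; (1+\sqrt{2})\cdot\xi_3 ,
  \qquad \xi_1,\xi_2,\xi_3 \in \Sigma .
\]
% An independent A-basis of A(1, sqrt 2, 1 + sqrt 2) is b^*_1 = 1, b^*_2 = sqrt 2.
% Substituting 1 + sqrt 2 = b^*_1 + b^*_2 gives
\[
  \xi^* \;\to\; 1\cdot(\xi_1+\xi_3) \;+\; \sqrt{2}\cdot(\xi_2+\xi_3),
  \qquad \eta_1 = \xi_1+\xi_3,\quad \eta_2 = \xi_2+\xi_3 .
\]
% The rule identifies xi^* with zero exactly when eta_1 = eta_2 = 0:
\[
  \xi^* = 0 \iff \xi_1 = \xi_2 = -\,\xi_3 .
\]
```

The three coefficients $1, \sqrt{2}, 1+\sqrt{2}$ are linearly dependent over A, which is why $\xi^*$ can vanish without all the $\xi_i$ vanishing.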

Lemma 1. Let $\theta$ be an element of $K^*$ and let
$f(\theta) = \theta^g + a_1\theta^{g-1} + \cdots + a_g = 0$, $a_i \subset A$, be the irreducible equation
for $\theta$ over A. The polynomial $f(x)$ remains irreducible in the polynomial ring
$\Sigma[x]$; in other words: the relative degrees $[A(\theta):A]$, $[\Sigma(\theta):\Sigma]$ are the same.

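Proof. Let $\phi(x) = x^m + \omega_1 x^{m-1} + \cdots + \omega_m$, $\omega_i \subset \Sigma$, be an irreducible
factor of $f(x)$ in $\Sigma[x]$. Let $\theta^{(1)}, \theta^{(2)}, \ldots, \theta^{(m)}$ be the roots of $\phi(x)$.
Since $\theta \subset K^*$ and since $K^*$ is a normal extension of A, all the roots of $f(x)$
are in $K^*$. Consequently, $\omega_1, \ldots, \omega_m \subset K^*$. Since the $\omega$'s are in $\Sigma$ and are
algebraic over K, they must also belong to the field $K'$. Consequently
$\omega_1, \ldots, \omega_m \subset A$, whence $\phi(x) = f(x)$, q.e.d.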
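
A minimal illustration of Lemma 1 follows. It is a sketch under assumptions of our own: the ground field $K = \mathbb{Q}$, the extension $K^* = \mathbb{Q}(\sqrt{2})$ and the two fields $\Sigma$ below are not taken from the text. It compares a case where $A = K$ with a case where A is larger than K.

```latex
% Case 1 (assumed): Sigma = Q(t).  Then K' = Q and A = K' \cap K^* = Q.
\[
  \theta = \sqrt{2}, \qquad f(x) = x^2 - 2, \qquad
  [A(\theta):A] \;=\; [\Sigma(\theta):\Sigma] \;=\; 2 ,
\]
% since sqrt 2 is not a rational function of t with rational coefficients.
%
% Case 2 (assumed): Sigma = Q(sqrt 2)(t).  Then K' = Q(sqrt 2) = A.
\[
  \theta = \sqrt{2}, \qquad f(x) = x - \sqrt{2}, \qquad
  [A(\theta):A] \;=\; [\Sigma(\theta):\Sigma] \;=\; 1 .
\]
% In Case 2 the polynomial x^2 - 2, irreducible over K, factors over Sigma;
% only the irreducible equation over A is preserved.
```

The second case shows why the statement is made over A and not over K.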

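By means of this Lemma we now show that $\Sigma^*$ is an integral domain
(i.e. has no zero divisors). Let $\xi^*\eta^* = 0$, $\xi^* = a^*_1\xi_1 + \cdots + a^*_m\xi_m$,
$\eta^* = b^*_1\eta_1 + \cdots + b^*_n\eta_n$, and let g be the relative degree of the field
$A(a^*_1, \ldots, a^*_m, b^*_1, \ldots, b^*_n)$ with respect to A. Let $\theta$ be a primitive
element of this field, satisfying an irreducible equation $F(\theta) = 0$ of degree g,
with coefficients in A. By our identification rule we have:

(1)     $\xi^* = \alpha_0 + \alpha_1\theta + \cdots + \alpha_{g-1}\theta^{g-1} = \phi(\theta)$,
(1′)    $\eta^* = \beta_0 + \beta_1\theta + \cdots + \beta_{g-1}\theta^{g-1} = \psi(\theta)$,     $\alpha_i, \beta_j \subset \Sigma$,

and by the same rule, the relation $\phi(\theta)\cdot\psi(\theta) = 0$ implies that the polynomial
$\phi(x)\cdot\psi(x)$ is divisible (in $\Sigma[x]$) by $F(x)$. By Lemma 1, $F(x)$ is irreducible
in $\Sigma[x]$. Hence, either $\phi(x)$ or $\psi(x)$ is identically zero, i.e. either $\xi^* = 0$ or
$\eta^* = 0$, which shows that $\Sigma^*$ has no zero divisors.

It now follows immediately that $\Sigma^*$ is a field. In fact, every element
of $\Sigma^*$ is of the form (1), for some $\theta \subset K^*$, and, by Lemma 1, $\Sigma^*$ contains the
entire field $\Sigma(\theta)$.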

Remark 1. We call attention to the important role which the field A plays
in the definition of the field $\Sigma^*$. It is this field, rather than the ground field K,
which really matters in our construction. By definition, A is the largest subfield
of $\Sigma$ which can be imbedded in $K^*$. We would get the same field $\Sigma^*$ if we took
A as ground field instead of K.

Of particular importance is the special case $K = K'$ (i.e. K is "maximally
algebraic" in $\Sigma$, or K is algebraically closed in $\Sigma$). In this case we have
$K = A$ for every normal extension of K.




Remark 2. The fields $\Sigma$ and $K^*$ are subfields of $\Sigma^*$ ² and have at least the
field K in common. It is not difficult to see that $\Sigma^*$ is the smallest field having
this property, i.e. any field $\Gamma$ with this property contains $\Sigma^*$. Our hypothesis
is to the effect that the field $\Gamma$ contains two subfields $\Sigma_1$ and $K^*_1$, simply
isomorphic to $\Sigma$ and $K^*$ respectively, and, moreover, that the field $K_1$ which
corresponds to K in the isomorphism between $K^*_1$ and $K^*$ is a subfield of $\Sigma_1$.
It is then clear that the intersection of $\Sigma_1$ and $K^*_1$ must be the field $A_1$ which
corresponds to A in the isomorphism $\Sigma \cong \Sigma_1$. Using the reasoning of the
proof of Lemma 1, it is immediately seen that the join $(\Sigma_1, K^*_1)$ of the two
subfields $\Sigma_1$ and $K^*_1$ of $\Gamma$ is abstractly isomorphic to the field $\Sigma^*$, and that this
isomorphism induces the given isomorphisms between $\Sigma_1$ and $\Sigma$, and between
$K^*_1$ and $K^*$.

2. Let $\mathfrak{o}$ be an arbitrary subring of $\Sigma$, subject to the only condition:
$K \subset \mathfrak{o}$. Let $\mathfrak{o}^* = K^*\mathfrak{o}$ be the extended ring in $\Sigma^*$, i.e. the ring whose elements
are of the form $a^*_1\xi_1 + \cdots + a^*_n\xi_n$, $a^*_i \subset K^*$, $\xi_i \subset \mathfrak{o}$. Let $A'$ be the
intersection of $\mathfrak{o}$ with A. Since A is an algebraic extension of K and since $K \subset \mathfrak{o}$,
it follows that $A'$ is a field.

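THEOREM 1. If $A' = A$, then $\mathfrak{o}^*\mathfrak{A} \cap \mathfrak{o} = \mathfrak{A}$ for any $\mathfrak{o}$-ideal $\mathfrak{A}$. In the
general case the relation $\mathfrak{o}^*\mathfrak{A} \cap \mathfrak{o} = \mathfrak{A}$ still holds true if $\mathfrak{A}$ is prime.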


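Proof. Let $\xi = a^*_1\xi_1 + \cdots + a^*_n\xi_n$, $a^*_i \subset K^*$, $\xi_i \subset \mathfrak{A}$, be an element of
$\mathfrak{o}^*\mathfrak{A} \cap \mathfrak{o}$, and let $\theta$ be a primitive element of $A'(a^*_1, \ldots, a^*_n)/A'$. Since
$A' \subset \mathfrak{o}$, we can write $\xi$ in the form:

(2)     $\xi = \eta_0 + \eta_1\theta + \cdots + \eta_{g-1}\theta^{g-1}$,

where $\eta_i \subset \mathfrak{A}$ and where g is the relative degree of $A'(a^*_1, \ldots, a^*_n)$ with
respect to $A'$. Under the hypothesis that $A' = A$, the elements $1, \theta, \ldots, \theta^{g-1}$
are linearly independent over A, hence also over $\Sigma$ (Lemma 1), and the
equation (2) implies that $\xi = \eta_0$, $\eta_1 = \cdots = \eta_{g-1} = 0$. Hence $\xi \equiv 0(\mathfrak{A})$. This
shows that $\mathfrak{o}^*\mathfrak{A} \cap \mathfrak{o} \subset \mathfrak{A}$, and since $\mathfrak{A} \subset \mathfrak{o}^*\mathfrak{A} \cap \mathfrak{o}$ it follows that
$\mathfrak{o}^*\mathfrak{A} \cap \mathfrak{o} = \mathfrak{A}$.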

In the general case and for a prime ideal $\mathfrak{A}$,³ we proceed as follows.
Multiplying (2) by $1, \theta, \ldots, \theta^{g-1}$ respectively, we get relations of the form:

² More precisely: $\Sigma^*$ contains two subfields abstractly isomorphic to $\Sigma$ and $K^*$
respectively, consisting of the elements $a^*_1\xi_1 + \cdots + a^*_m\xi_m$ in which
$a^*_1, \ldots, a^*_m \subset K$ or $\xi_1, \ldots, \xi_m \subset K$ respectively.

*The following example illustrates the possibility: if Let 
K be the field of rational numbers, K* =K(V2) and let 2=K*(a#). If we 
regard K as the ground field then the extension K ~K* does not affect 2, i.e. we 
have Let o = Kia, V2], Then 9* = K*[-], o* 9 = and 
o'Yng=o- (x7,2-V2) #9. Here the fields A and A’ coincide with K* and K 
respectively. 




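        $\xi\theta^i = \eta_{i0} + \eta_{i1}\theta + \cdots + \eta_{i,g-1}\theta^{g-1}$,     $i = 0, 1, \ldots, g-1$,

where all the $\eta_{ij}$ are in $\mathfrak{A}$. Hence $|\eta_{ij} - \delta_{ij}\xi| = 0$, where $\delta_{ij} = 0$ if
$i \neq j$, and $\delta_{ii} = 1$. This equation is of the following form:

        $\xi^g + \beta_1\xi^{g-1} + \cdots + \beta_g = 0$,

where $\beta_i \equiv 0(\mathfrak{A})$, $i = 1, 2, \ldots, g$. Hence $\xi^g \equiv 0(\mathfrak{A})$, and since $\mathfrak{A}$ is prime,
we conclude, as in the first part of the proof, that $\xi \equiv 0(\mathfrak{A})$, q.e.d.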

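An ideal $\mathfrak{A}^*$ in $\mathfrak{o}^*$ is said to lie over an ideal $\mathfrak{A}$ in $\mathfrak{o}$ if the relation
$\mathfrak{A}^* \cap \mathfrak{o} = \mathfrak{A}$ is satisfied. It has been proved by Krull ⁴ that over every prime
ideal $\mathfrak{p}$ in $\mathfrak{o}$ there lies at least one prime ideal $\mathfrak{p}^*$ in $\mathfrak{o}^*$, provided that $\mathfrak{o}^*$ be
integrally dependent on $\mathfrak{o}$ (i.e. that each element of $\mathfrak{o}^*$ be integrally dependent
on elements of $\mathfrak{o}$). This provision is satisfied in our case, since $\mathfrak{o}^* = K^*\mathfrak{o}$ and
since $K^*$, as an algebraic extension field of K, is certainly integrally dependent
on K ($K \subset \mathfrak{o}$).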

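We consider a prime $\mathfrak{o}^*$-ideal $\mathfrak{p}^*$ which lies over $\mathfrak{p}$ and we denote by $K_{\mathfrak{p}}$
and $K^*_{\mathfrak{p}^*}$ the residue class fields of $\mathfrak{p}$ and $\mathfrak{p}^*$ respectively, i.e. the quotient
fields of the residue class rings $\mathfrak{o}/\mathfrak{p}$ and $\mathfrak{o}^*/\mathfrak{p}^*$ respectively. Since $\mathfrak{p}^* \cap \mathfrak{o} = \mathfrak{p}$,
$K_{\mathfrak{p}}$ may be regarded as a subfield of $K^*_{\mathfrak{p}^*}$. Moreover, K and $K^*$ may be
regarded as subfields of $K_{\mathfrak{p}}$ and $K^*_{\mathfrak{p}^*}$ respectively.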

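LEMMA 2. $K^*_{\mathfrak{p}^*}$ is the extension field of $K_{\mathfrak{p}}$ obtained by the extension
$K \to K^*$ of the ground field K; in symbols: $K^*_{\mathfrak{p}^*} = K^* \cdot K_{\mathfrak{p}}$.

We observe that $K^*$ and $K_{\mathfrak{p}}$ are subfields of $K^*_{\mathfrak{p}^*}$ having at least the field
K in common. Hence, by Remark 2 of the preceding section, we have:
$K^*_{\mathfrak{p}^*} \supseteq K^* \cdot K_{\mathfrak{p}}$. On the other hand, any element of $\mathfrak{o}^*/\mathfrak{p}^*$ is of the form
$a^*_1\bar{\xi}_1 + \cdots + a^*_m\bar{\xi}_m$, $a^*_i \subset K^*$, $\bar{\xi}_i \subset \mathfrak{o}/\mathfrak{p}$. This shows that the ring $\mathfrak{o}^*/\mathfrak{p}^*$,
and hence also its quotient field $K^*_{\mathfrak{p}^*}$, is contained in the field $(K^*, K_{\mathfrak{p}})$.
Hence $K^*_{\mathfrak{p}^*} = K^*K_{\mathfrak{p}}$, as was asserted.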

3. Unramified character of the maximal $\mathfrak{o}$-ideals. We make the
following assumption:

The field A is a finite extension of K. This assumption is always satisfied
if, for instance, $K'$ (the algebraic closure of K in $\Sigma$) is itself a finite
extension of K.


Under this assumption we prove the following fundamental theorem:


⁴ W. Krull, "Zum Dimensionsbegriff der Idealtheorie" (Beiträge zur Arithmetik
kommutativer Integritätsbereiche, III), Mathematische Zeitschrift, vol. 42 (1937),
p. 749.




THEOREM 2. If $\mathfrak{p}$ is a maximal $\mathfrak{o}$-ideal ⁵ then $\mathfrak{o}^*\mathfrak{p}$ is the intersection of
the prime $\mathfrak{o}^*$-ideals which lie over $\mathfrak{p}$.

For polynomial rings $\mathfrak{o} = K[x_1, \ldots, x_n]$ this theorem is due to van der
Waerden.⁶ It appears as a special case of a generalized discriminant theorem
proved by Krull for any pair of integral domains $\mathfrak{o}$, $\mathfrak{o}^*$ ($\mathfrak{o}^*$ integrally
dependent on $\mathfrak{o}$) under the hypothesis that $\mathfrak{o}$ is integrally closed in its quotient
field.⁷ If we assume, as it is permissible to do, that $\Sigma$ is the quotient field
of $\mathfrak{o}$, then Krull's hypothesis in our special case implies that $K' \subset \mathfrak{o}$, whence
$A = A'$. The special case when $\mathfrak{o}^*$ is obtained from $\mathfrak{o}$ by a separable extension
of the ground field has been treated separately by Krull in his report
"Idealtheorie" (p. 40). However, this treatment too is based on the tacit
assumption that the fields A and $A'$ coincide. Namely, under this assumption it is
permissible to take A as ground field, since $A = A' \subset \mathfrak{o}$, i.e. we may put
$A = K$, and then our assumption $A = A'$ becomes: $K^* \cap K' = K$. It follows
then, by Lemma 1, that the Galois groups of $\Sigma^*/\Sigma$ and of $K^*/K$ coincide
(i.e. every relative automorphism of $K^*$ over K can be extended to a relative
automorphism of $\Sigma^*$ over $\Sigma$; note that $\Sigma^*$ is at any rate a normal extension of
$\Sigma$). One defines then in a natural fashion the concepts of conjugate ideals
and of invariant ideals in $\mathfrak{o}^*$. The proof by van der Waerden and its
generalization by Krull are then applicable, leading to the following theorem:


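THEOREM 2′ (van der Waerden-Krull). If $A = A'$, and if $K^*$ is a
separable extension of A, then each invariant $\mathfrak{o}^*$-ideal $\mathfrak{A}^*$ is the extended
ideal of its contracted ideal in $\mathfrak{o}$: $\mathfrak{A}^* = \mathfrak{o}^* \cdot (\mathfrak{A}^* \cap \mathfrak{o})$, and for each prime
$\mathfrak{o}$-ideal $\mathfrak{p}$ ⁸ it is true that $\mathfrak{o}^*\mathfrak{p}$ is the intersection of the prime $\mathfrak{o}^*$-ideals which
lie over $\mathfrak{p}$.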

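We shall make use of Theorem 2′ in order to prove our more general
theorem for maximal ideals.

Let $\bar{A}$ be the least normal extension of K which contains the field A,
i.e. $\bar{A}$ is the join of A and of its conjugate fields over K. By our assumption,
$\bar{A}$ is a finite extension of K. We introduce the intermediate field $\bar{\Sigma} = \bar{A}\Sigma$,
a finite algebraic extension of $\Sigma$, and the intermediate ring $\bar{\mathfrak{o}} = \bar{A}\mathfrak{o}$, so that
$\Sigma \subseteq \bar{\Sigma} \subseteq \Sigma^*$, $\mathfrak{o} \subseteq \bar{\mathfrak{o}} \subseteq \mathfrak{o}^*$. The ground field extension $K \to K^*$ is thus
decomposed into two successive normal extensions: $K \to \bar{A}$, $\bar{A} \to K^*$. We have
clearly the relations: $\Sigma^* = K^*\bar{\Sigma}$, $\mathfrak{o}^* = K^*\bar{\mathfrak{o}}$.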


⁵ An ideal is maximal (or divisorless) if it is not properly contained in any other
ideal, different from the unit ideal.

⁶ B. L. van der Waerden, "Eine Verallgemeinerung des Bezoutschen Theorems," § 5,
Mathematische Annalen, vol. 99 (1928).

⁷ W. Krull, "Der allgemeine Diskriminantensatz. Unverzweigte Ringerweiterungen"
(Beiträge zur Arithmetik kommutativer Integritätsbereiche, VI), Mathematische
Zeitschrift, vol. 45 (1939).

⁸ Not necessarily maximal as in Theorem 2.




We assert that the relative algebraic closure $\bar{A}'$ of $\bar{A}$ in $\bar{\Sigma}$ is the field $\bar{A}K'$.
To show this, we first observe that $A = \bar{A} \cap K'$, and hence, by Lemma 1,
$[\bar{A}:A] = [\bar{\Sigma}:\Sigma] = g$. Let $\bar{\alpha}$ be an element of $\bar{\Sigma}$ which is algebraic over $\bar{A}$,
hence also algebraic over $K'$. The relative degree $[K'(\bar{\alpha}):K']$ cannot be
greater than the relative degree $[\Sigma(\bar{\alpha}):\Sigma]$, and since $[\Sigma(\bar{\alpha}):\Sigma] \leq [\bar{\Sigma}:\Sigma]$
it follows that $[K'(\bar{\alpha}):K'] \leq g$. This last inequality holds true for any
element $\bar{\alpha}$ in $\bar{A}'$, and consequently $[\bar{A}':K'] \leq g$. On the other hand, $\bar{A}'$
contains the field $\bar{A}K'$, and we have $[\bar{A}K':K'] = [\bar{A}:A] = g$, in view of the
relation $A = \bar{A} \cap K'$ and of Lemma 1. Hence necessarily $\bar{A}' = \bar{A}K'$, as was
asserted.

We now prove the relation:

(3)     $\bar{A} = K^* \cap \bar{A}K'$.

Let $a^*$ be an element of $K^* \cap \bar{A}K'$. Since $A = K^* \cap K'$, we have
$[K'(a^*):K'] = [A(a^*):A]$. Now we have just proved that $[\bar{A}K':K'] = g$. Since
$a^* \subset \bar{A}K'$, we conclude that $[A(a^*):A] \leq g$, for any element $a^*$ in $K^* \cap \bar{A}K'$.
Hence this last field is of relative degree $\leq g$ over A. Since on the other hand
this field contains $\bar{A}$, and since $[\bar{A}:A] = g$, the relation (3) is established.

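The relation (3) says that $\bar{A}$ is the intersection of $K^*$ with the algebraic
closure of $\bar{A}$ in $\bar{\Sigma}$. The ground field extension $\bar{A} \to K^*$ therefore satisfies the
condition of Theorem 2′. We therefore know that for every prime ideal $\bar{\mathfrak{p}}$ in $\bar{\mathfrak{o}}$
the extended ideal $\mathfrak{o}^*\bar{\mathfrak{p}}$ is the intersection of the prime $\mathfrak{o}^*$-ideals which lie
over $\bar{\mathfrak{p}}$. Let us assume that Theorem 2 has already been proved for the ground
field extension $K \to \bar{A}$ and, moreover, let us assume that there is only a finite
number of prime $\bar{\mathfrak{o}}$-ideals which lie over a given maximal prime $\mathfrak{o}$-ideal $\mathfrak{p}$.
We will have then:

        $\bar{\mathfrak{o}}\mathfrak{p} = [\bar{\mathfrak{p}}_1, \bar{\mathfrak{p}}_2, \ldots, \bar{\mathfrak{p}}_m]$.

The ideals $\bar{\mathfrak{p}}_i$ are also maximal in $\bar{\mathfrak{o}}$,⁹ hence are two by two free from common
divisors. Therefore their intersection coincides with their product:

(4)     $\bar{\mathfrak{o}}\mathfrak{p} = \bar{\mathfrak{p}}_1\bar{\mathfrak{p}}_2 \cdots \bar{\mathfrak{p}}_m$.

By Theorem 2′ we have

        $\mathfrak{o}^*\bar{\mathfrak{p}}_i = [\mathfrak{p}^*_{i1}, \mathfrak{p}^*_{i2}, \ldots]$,

where $\mathfrak{p}^*_{ij} \cap \bar{\mathfrak{o}} = \bar{\mathfrak{p}}_i$. Since $(\bar{\mathfrak{p}}_i, \bar{\mathfrak{p}}_j) = \bar{\mathfrak{o}}$, if $i \neq j$, we have also
$(\mathfrak{o}^*\bar{\mathfrak{p}}_i, \mathfrak{o}^*\bar{\mathfrak{p}}_j) = \mathfrak{o}^*$. Hence the product of the ideals $\mathfrak{o}^*\bar{\mathfrak{p}}_i$ coincides with
their intersection, and therefore, by (4),

        $\mathfrak{o}^*\mathfrak{p} = [\mathfrak{o}^*\bar{\mathfrak{p}}_1, \ldots, \mathfrak{o}^*\bar{\mathfrak{p}}_m] = [\ldots, \mathfrak{p}^*_{ij}, \ldots]$,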

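⁹ Since $\mathfrak{p}$ is maximal, the ring $\mathfrak{o}/\mathfrak{p}$ is a field. The integral domain $\bar{\mathfrak{o}}/\bar{\mathfrak{p}}_i$ is
integrally dependent on its subfield $\mathfrak{o}/\mathfrak{p}$ and hence is also a field. Consequently
$\bar{\mathfrak{p}}_i$ is maximal.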




which proves Theorem 2.

Thus, to complete the proof of Theorem 2, we have only to prove it for
an arbitrary finite ground field extension $K \to \bar{A}$, and we have also to show that
the number of prime $\bar{\mathfrak{o}}$-ideals which lie over $\mathfrak{p}$ is finite. This we shall do in
the following section.

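4. Merely as a matter of notations, we may identify $A'$ with K, since
$A' \subset \mathfrak{o}$. Let $\mathfrak{p}$ be a maximal $\mathfrak{o}$-ideal and let $K_{\mathfrak{p}}$ ($= \mathfrak{o}/\mathfrak{p}$) be the residue class
field of $\mathfrak{p}$. Let $\bar{A}_0 = K_{\mathfrak{p}} \cap \bar{A}$,¹⁰ whence $K \subseteq \bar{A}_0 \subseteq \bar{A}$, and let $\bar{A}_0 = K(\vartheta)$, where
$\vartheta$ is a primitive element of $\bar{A}_0$ over K. Let $f(\vartheta) = 0$ be the irreducible
equation, say of degree m, which $\vartheta$ satisfies over K. Since $\vartheta \subset K_{\mathfrak{p}}$ and $K_{\mathfrak{p}} = \mathfrak{o}/\mathfrak{p}$
(by hypothesis: $\mathfrak{p}$ is maximal!), there must exist in $\mathfrak{o}$ an element $\omega$ such that

(5)     $f(\omega) \equiv 0(\mathfrak{p})$.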


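If $\bar{\mathfrak{p}}$ is a prime $\bar{\mathfrak{o}}$-ideal which lies over $\mathfrak{p}$, then

(5′)    $f(\omega) \equiv 0(\bar{\mathfrak{p}})$.

Since $\bar{A}$ is normal over K and since one root, $\vartheta = \vartheta_1$, of the polynomial $f(x)$
is in $\bar{A}$, all its roots are in $\bar{A}$, whence also in $\bar{\mathfrak{o}}$. Hence, by (5′), we must
have $\omega \equiv \vartheta_i(\bar{\mathfrak{p}})$, where $\vartheta_i$ is one of the roots of $f(x)$. Let, say
$\omega \equiv \vartheta_1(\bar{\mathfrak{p}})$. We assert that

(6)     $\bar{\mathfrak{p}} = (\bar{\mathfrak{o}}\mathfrak{p}, \omega - \vartheta_1)$.

Let $\theta$ be a primitive element of $\bar{A}$ over $K(\vartheta_1)$, and let $[\bar{A}:K(\vartheta_1)] = n$.
Every element $\bar{\xi}$ of $\bar{\mathfrak{o}}$ can be written in the form:

        $\bar{\xi} = \alpha_0 + \alpha_1\theta + \cdots + \alpha_{n-1}\theta^{n-1}$,

where

        $\alpha_i = a_{i0} + a_{i1}\vartheta_1 + \cdots + a_{i,m-1}\vartheta_1^{m-1}$,     $a_{ij} \subset \mathfrak{o}$.

Since $\omega \equiv \vartheta_1(\bar{\mathfrak{p}})$, we have: $\alpha_i \equiv a_{i0} + a_{i1}\omega + \cdots + a_{i,m-1}\omega^{m-1}\ (\bar{\mathfrak{p}})$. The
right-hand side of this congruence is an element of $\mathfrak{o}$. Consequently, in the
homomorphism $\bar{\mathfrak{o}} \to \bar{\mathfrak{o}}/\bar{\mathfrak{p}}$ the elements $\alpha_0, \alpha_1, \ldots, \alpha_{n-1}$ are mapped upon elements
of $K_{\mathfrak{p}}$ ($= \mathfrak{o}/\mathfrak{p}$). Since $K(\vartheta_1) = \bar{A}_0 = K_{\mathfrak{p}} \cap \bar{A}$, the elements $1, \theta, \ldots, \theta^{n-1}$ are
linearly independent not only over $K(\vartheta_1)$, but also over $K_{\mathfrak{p}}$ (Lemma 1).
Consequently $\bar{\xi}$ cannot belong to $\bar{\mathfrak{p}}$, unless all the elements $\alpha_0, \alpha_1, \ldots, \alpha_{n-1}$
belong to $\bar{\mathfrak{p}}$. We have: $\alpha_i \equiv a_{i0} + a_{i1}\omega + \cdots + a_{i,m-1}\omega^{m-1}\ (\omega - \vartheta_1)$,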


¹⁰ By $K_{\mathfrak{p}} \cap \bar{A}$ is meant the intersection of $\bar{A}$ (normal finite extension of K) with
the relative algebraic closure $K'_{\mathfrak{p}}$ of K in $K_{\mathfrak{p}}$, in the same sense as A was defined by
the relation: $A = K^* \cap K'$. See Section 1.




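and if $\alpha_i \equiv 0(\bar{\mathfrak{p}})$, then $a_{i0} + a_{i1}\omega + \cdots + a_{i,m-1}\omega^{m-1} \equiv 0(\mathfrak{p})$, since $\mathfrak{p} = \bar{\mathfrak{p}} \cap \mathfrak{o}$.
This shows that if $\alpha_i \equiv 0(\bar{\mathfrak{p}})$, then $\alpha_i \equiv 0(\bar{\mathfrak{o}}\mathfrak{p}, \omega - \vartheta_1)$, and consequently also
$\bar{\xi} \equiv 0(\bar{\mathfrak{o}}\mathfrak{p}, \omega - \vartheta_1)$, which proves the relation (6).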

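From (6) it follows already that the number of prime $\bar{\mathfrak{o}}$-ideals which lie
over $\mathfrak{p}$ is finite, since it cannot be greater than m ($= [\bar{A}_0:K]$). Let these
prime ideals be $\bar{\mathfrak{p}}_1, \ldots, \bar{\mathfrak{p}}_h$, and let

(6′)    $\bar{\mathfrak{p}}_i = (\bar{\mathfrak{o}}\mathfrak{p}, \omega - \vartheta_i)$,     $i = 1, 2, \ldots, h$.

Since the $\bar{\mathfrak{p}}_i$ are also maximal, we have

(7)     $[\bar{\mathfrak{p}}_1, \ldots, \bar{\mathfrak{p}}_h] = \bar{\mathfrak{p}}_1 \cdots \bar{\mathfrak{p}}_h = (\bar{\mathfrak{o}}\mathfrak{p}, \prod_{i=1}^{h} (\omega - \vartheta_i))$.

Let $\xi = \prod_{i=h+1}^{m} (\omega - \vartheta_i)$ and let us consider the ideal $(\bar{\mathfrak{o}}\mathfrak{p}, \xi)$. We assert that
it is the unit ideal. Namely, in the contrary case let $\bar{\mathfrak{p}}$ be a prime ideal
divisor of $(\bar{\mathfrak{o}}\mathfrak{p}, \xi)$. Since $\mathfrak{p} \subset \bar{\mathfrak{p}}$ and $\mathfrak{p}$ is maximal, $\bar{\mathfrak{p}}$ must lie over $\mathfrak{p}$. Hence
$\bar{\mathfrak{p}}$ must be one of the ideals $\bar{\mathfrak{p}}_1, \ldots, \bar{\mathfrak{p}}_h$, say $\bar{\mathfrak{p}} = \bar{\mathfrak{p}}_1$. Now since $\xi \equiv 0(\bar{\mathfrak{p}}_1)$,
one of the factors $\omega - \vartheta_i$, $i = h+1, \ldots, m$, must belong to $\bar{\mathfrak{p}}_1$, say
$\omega - \vartheta_{h+1} \equiv 0(\bar{\mathfrak{p}}_1)$. Hence $\vartheta_1 - \vartheta_{h+1} \equiv 0(\bar{\mathfrak{p}}_1)$, and this is impossible, since
$\vartheta_1 \neq \vartheta_{h+1}$, and since $\vartheta_1 - \vartheta_{h+1}$ is an element of the subfield $\bar{A}$ of $\bar{\mathfrak{o}}$.

It is therefore proved that $(\bar{\mathfrak{o}}\mathfrak{p}, \xi) = \bar{\mathfrak{o}}$. Consequently $\prod_{i=1}^{h} (\omega - \vartheta_i) \equiv 0(\bar{\mathfrak{o}}\mathfrak{p})$,
since $\xi \cdot \prod_{i=1}^{h} (\omega - \vartheta_i) = f(\omega) \equiv 0(\mathfrak{p})$. Comparing with (7) we find:

        $\bar{\mathfrak{o}}\mathfrak{p} = [\bar{\mathfrak{p}}_1, \ldots, \bar{\mathfrak{p}}_h]$,

as was asserted.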
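
The steps of this section can be traced on the simplest possible case. The example below is a sketch with data of our own choosing, none of it from the text: the ground field is $\mathbb{Q}$, the ring is $\mathbb{Q}[x]$, and the finite normal extension of Section 4 is taken to be $\mathbb{Q}(\sqrt{2})$.

```latex
% Assumed data: K = Q, o = Q[x], p = (x^2 - 2) (a maximal ideal),
% finite normal extension Q(sqrt 2), extended ring \bar{o} = Q(sqrt 2)[x].
\[
  K_{\mathfrak{p}} = \mathbb{Q}[x]/(x^2-2) \cong \mathbb{Q}(\sqrt{2}), \qquad
  \vartheta = \sqrt{2}, \quad f(x) = x^2 - 2, \quad m = 2, \quad \omega = x .
\]
% Relation (5): f(omega) = x^2 - 2 \equiv 0 (p).  Roots: theta_1 = sqrt 2, theta_2 = -sqrt 2.
% Relation (6) for each root:
\[
  \bar{\mathfrak{p}}_1 = (\bar{\mathfrak{o}}\mathfrak{p},\; x - \sqrt{2}) = (x - \sqrt{2}), \qquad
  \bar{\mathfrak{p}}_2 = (\bar{\mathfrak{o}}\mathfrak{p},\; x + \sqrt{2}) = (x + \sqrt{2}) .
\]
% Here h = m = 2, and the conclusion of Theorem 2 reads
\[
  \bar{\mathfrak{o}}\mathfrak{p} = (x^2 - 2) = (x-\sqrt{2})(x+\sqrt{2})
  = [\,\bar{\mathfrak{p}}_1,\ \bar{\mathfrak{p}}_2\,] .
\]
```

In this instance every root of $f$ yields a prime ideal over $\mathfrak{p}$; in general only h of the m roots need do so, and the argument above shows that the remaining factors generate the unit ideal together with $\mathfrak{p}$.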


5. It can be shown by examples that Theorem 2 is not generally true
for non-maximal ideals.¹¹ For arbitrary prime $\mathfrak{o}$-ideals some weaker result


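¹¹ Let K be the field of rational numbers, and let $\Sigma = K(\sqrt{2})(x, y)$, where x, y
are independent variables. We put $K^* = K(\sqrt{2})$, $\mathfrak{o} = K[x, y, z]$, where $z = \sqrt{2}\cdot xy$.
We have $\Sigma^* = \Sigma$ and $\mathfrak{o}^* = K(\sqrt{2})[x, y]$. Let $\mathfrak{p} = \mathfrak{o}\cdot(y^2 - 2, z - 2x)$.
Observing that every element of $\mathfrak{o}$ can be put in the form $f(x, y) + z\cdot g(x, y)$, where
$f(x, y), g(x, y) \subset K[x, y]$, it is a straightforward matter to verify that $\mathfrak{p}$ is prime.
It is not maximal, since it is contained in the prime ideal $\mathfrak{o}\cdot(x, z, y^2 - 2)$. We have
$\mathfrak{o}^*\mathfrak{p} = \mathfrak{o}^*\cdot(y^2 - 2, \sqrt{2}\cdot x(y - \sqrt{2})) = [\mathfrak{p}^*, \mathfrak{p}^*_1]$, where

        $\mathfrak{p}^* = \mathfrak{o}^*\cdot(y - \sqrt{2})$,     $\mathfrak{p}^*_1 = \mathfrak{o}^*\cdot(x, y + \sqrt{2})$.

The ideal $\mathfrak{p}^*$ lies over $\mathfrak{p}$. In fact, any element $f(x, y) + z\,g(x, y)$, reduced modulo $\mathfrak{p}$,
gives a residue of the form $A(x) + yB(x)$ (since $y^2 \equiv 2(\mathfrak{p})$ and $z \equiv 2x(\mathfrak{p})$). Here
$A(x)$ and $B(x)$ are in $K[x]$. Should this residue belong to $\mathfrak{p}^*$, it is necessary that
$A(x) + \sqrt{2}\cdot B(x)$ be identically zero. Hence $A(x) + yB(x)$ is also identically zero,
and this shows that $\mathfrak{p}^* \cap \mathfrak{o} = \mathfrak{p}$. However, the ideal $\mathfrak{p}^*_1$ lies over the prime ideal
$\mathfrak{o}\cdot(x, z, y^2 - 2)$ which is a proper divisor of $\mathfrak{p}$. It is remarkable that in this example
$\mathfrak{o}^*\mathfrak{p}$ possesses even an isolated component $\mathfrak{p}^*_1$ different from $\mathfrak{p}^*$, since
$\mathfrak{p}^* \not\equiv 0(\mathfrak{p}^*_1)$.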




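can be established by the usual artifice of quotient rings. Let $\mathfrak{p}$ be an arbitrary
prime $\mathfrak{o}$-ideal and let $\mathfrak{J}$ be the quotient ring of $\mathfrak{p}$.¹² Let $\mathfrak{J}^* = K^*\mathfrak{J}$.
It is well known that the prime ideals in $\mathfrak{J}$ correspond in one to one fashion
to $\mathfrak{p}$ and the prime $\mathfrak{o}$-ideals which are contained in $\mathfrak{p}$. If $\mathfrak{p}_1$ and $\mathfrak{P}_1$ are
corresponding prime ideals in $\mathfrak{o}$ and $\mathfrak{J}$ respectively, then $\mathfrak{P}_1 = \mathfrak{J}\cdot\mathfrak{p}_1$, $\mathfrak{P}_1 \cap \mathfrak{o} = \mathfrak{p}_1$.
The prime ideal $\mathfrak{P} = \mathfrak{J}\cdot\mathfrak{p}$ is maximal in $\mathfrak{J}$. By Theorem 2 we have therefore:

        $\mathfrak{J}^*\mathfrak{P} = [\mathfrak{P}^*_1, \mathfrak{P}^*_2, \ldots]$,

where $\mathfrak{P}^*_1, \mathfrak{P}^*_2, \ldots$ are the prime ideals in $\mathfrak{J}^*$ which lie over $\mathfrak{P}$.

Similarly, it can be shown in a simple manner that the prime ideals in
$\mathfrak{J}^*$ correspond, one to one, to the prime ideals in $\mathfrak{o}^*$ which lie over $\mathfrak{p}$ or over
prime multiples of $\mathfrak{p}$. The correspondence is again the one of contracted and
extended ideals.¹³ Let $\mathfrak{P}^*_i \cap \mathfrak{o}^* = \mathfrak{p}^*_i$ and put

        $\mathfrak{o}^*\mathfrak{p} = \mathfrak{m}^*$,     $[\mathfrak{p}^*_1, \mathfrak{p}^*_2, \ldots] = \mathfrak{m}^*_1$.

We have evidently: $\mathfrak{J}^*\mathfrak{m}^* = \mathfrak{J}^*\mathfrak{m}^*_1 = [\mathfrak{P}^*_1, \mathfrak{P}^*_2, \ldots]$. Let us assume that
Hilbert's basis theorem holds in $\mathfrak{o}^*$. From the relation $\mathfrak{J}^*\mathfrak{m}^* = \mathfrak{J}^*\mathfrak{m}^*_1$ follows
that for every $\alpha^* \subset \mathfrak{m}^*_1$ there exists an element $\alpha$ in $\mathfrak{o}$ but not in $\mathfrak{p}$, such that
$\alpha\alpha^* \subset \mathfrak{m}^*$. By Hilbert's basis theorem there exists then an element $\beta$ in $\mathfrak{o}$,
not in $\mathfrak{p}$, such that $\beta\mathfrak{m}^*_1 \equiv 0(\mathfrak{m}^*)$. This shows that $\mathfrak{p}^*_1, \mathfrak{p}^*_2, \ldots$ are isolated
components of $\mathfrak{m}^*$. Since $\mathfrak{o}^*\mathfrak{p} \cap \mathfrak{o} = \mathfrak{p}$, by Theorem 1, it follows that the
decomposition of $\mathfrak{o}^*\mathfrak{p}$ into primary components is of the form

        $\mathfrak{o}^*\mathfrak{p} = [\mathfrak{p}^*_1, \mathfrak{p}^*_2, \ldots, \mathfrak{q}^*_1, \mathfrak{q}^*_2, \ldots]$,

where the prime ideals $\mathfrak{p}'^*_1, \mathfrak{p}'^*_2, \ldots$ to which $\mathfrak{q}^*_1, \mathfrak{q}^*_2, \ldots$ belong, lie over
proper prime divisors of $\mathfrak{p}$.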


6. The following theorem, which we shall have occasion to use in the
sequel, gives a sufficient condition that $\mathfrak{o}^*\mathfrak{p}$ be prime, where $\mathfrak{p}$ is now an
arbitrary prime $\mathfrak{o}$-ideal, maximal or not.


Op Consists of all quotients a,8 C 
The elements of are all of the form a*/a, a* C 9*, aC 9g, 0(p). Let 


p" be a prime g*-ideal which lies over a prime multiple of p, and let B* = X*p*. Let 
a*/a, B*/B be two elements in Whose product is in 93*. Then a*p*/ap = y*/y, 


where y* = O(p*), and therefore ya*B* = 0(p*). Since p* = O(p), it follows that 


This shows that is prime. Let a* a* = p* = 0(p*). Then 
This shows that 
Pa o* = p* 


If p*29=40(f), then let a be an element of g which is in p* but not in p. 
Since a is a unit in %*, it follows that S"p" = &’. 

If 9§* is an arbitrary prime ideal in 3", and if p* =9B* ao", then any element 
a*/ain Q* is such that a* is in p*, since a is a unit in S*: This shows that 3* = S"p*. 


o*p = m* 

YA0(p*), and hence either a* or is in i.e. either a*/a or B*/B is in 




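THEOREM 3. A sufficient condition that $\mathfrak{o}^*\mathfrak{p}$ be prime is that $A'$ be the
intersection of the fields $K_{\mathfrak{p}}$ and $K^*$.


Proof. Let $\mathfrak{p}^*$ be a prime $\mathfrak{o}^*$-ideal over $\mathfrak{p}$. Every element $\alpha^*$ of $\mathfrak{o}^*$ can be
written in the form: $\alpha^* = \alpha_0 + \alpha_1\theta + \cdots + \alpha_{g-1}\theta^{g-1}$, where $\alpha_i \subset \mathfrak{o}$ and $\theta$ is
an element of $K^*$, of degree g over $A'$. If $\alpha^* \equiv 0(\mathfrak{p}^*)$, then passing to the
residue field we find the relation: $\bar{\alpha}_0 + \bar{\alpha}_1\theta + \cdots + \bar{\alpha}_{g-1}\theta^{g-1} = 0$, where
$\bar{\alpha}_i \subset K_{\mathfrak{p}}$. Now if $K_{\mathfrak{p}} \cap K^* = A'$, then, by Lemma 1, the elements $1, \theta, \ldots, \theta^{g-1}$
are linearly independent over $K_{\mathfrak{p}}$. Hence $\bar{\alpha}_i = 0$, i.e. $\alpha_i \equiv 0(\mathfrak{p})$. This shows
that $\alpha^*$ is contained in $\mathfrak{o}^*\mathfrak{p}$, consequently $\mathfrak{o}^*\mathfrak{p} = \mathfrak{p}^*$, q.e.d.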

Let $K'_{\mathfrak{p}}$ be the largest subfield of $K_{\mathfrak{p}}$ which is algebraic over K, i.e. $K'_{\mathfrak{p}}$ is
the relative algebraic closure of K in $K_{\mathfrak{p}}$. Let $K^*$ be the least normal extension
of K which contains $K'_{\mathfrak{p}}$ and let $\mathfrak{o}^* = K^*\mathfrak{o}$. If $\mathfrak{p}^*$ is any prime $\mathfrak{o}^*$-ideal over $\mathfrak{p}$,
then, by a result proved in Section 2, we have: $K^*_{\mathfrak{p}^*} = K^* \cdot K_{\mathfrak{p}}$. In view of our
choice of $K^*$, it follows that $K^*$ is algebraically closed in $K^*_{\mathfrak{p}^*}$. Hence
if $K^*_1$ is any normal extension of the new ground field $K^*$, the condition of
Theorem 3 is satisfied and $\mathfrak{p}^*$ will remain prime when we pass from $\mathfrak{o}^*$ to the
ring $K^*_1\mathfrak{o}^*$. This is true, in particular, if we pass from $K^*$ to the algebraically
closed field determined by K. In other words: the extension $K \to K^*$ causes
the maximal splitting of $\mathfrak{p}$ into prime ideals.


II. Algebraic varieties over arbitrary ground fields. 


7. Let $\Sigma$ be a field of algebraic functions of r independent variables,
over an arbitrary ground field K of characteristic zero.¹⁴ We do not assume
that K is algebraically closed in $\Sigma$. Let $\xi_1, \ldots, \xi_n$ be a set of generators of $\Sigma$,
i.e. $\Sigma = K(\xi_1, \ldots, \xi_n)$, and let $\mathfrak{o} = K[\xi_1, \ldots, \xi_n]$ be the ring consisting of
those elements of $\Sigma$ which can be expressed as polynomials in $\xi_1, \ldots, \xi_n$.
With the elements $\xi_i$ we associate an irreducible algebraic r-dimensional variety
$V_r$ whose general point has coördinates $\xi_1, \ldots, \xi_n$. A point P of $V_r$ shall be
associated with a prime zero-dimensional ideal $\mathfrak{p}_0$ in $\mathfrak{o}$. The geometric terms:
"variety," "coördinates," "point," are so far purely formal and conventional
expressions. To confer upon these terms a geometric reality it is necessary to
imbed our $V_r$ in an affine n-dimensional space $S_n^A$ over some field A.¹⁵ The
field A may be either K itself or an algebraic extension of K. Now the residue
class field $K_{\mathfrak{p}_0}$ ($= \mathfrak{o}/\mathfrak{p}_0$) of the prime zero-dimensional ideal $\mathfrak{p}_0$ may very well
be a proper extension of K (necessarily algebraic). Hence, in general, there


¹⁴ We assume, of course, that $\Sigma$ is a finite extension of K (of degree of
transcendency r).

¹⁵ By the symbol $S_n^A$ we mean an affine n-space in which every point has
coördinates in A.




will not exist elements $c_1, \ldots, c_n$ in K such that $\xi_i \equiv c_i(\mathfrak{p}_0)$. In such a case
our point P is not represented by a geometric point of $S_n^K$. On the other
hand, if we take for A some normal extension of K, for instance the
algebraically closed field determined by K, then the results of the preceding
sections show that P may be represented in $S_n^A$ by a set of points.¹⁶
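
A small instance may make the last two sentences concrete. The sketch below uses assumed data of our own (the rational ground field and the affine plane), not an example from the text.

```latex
% Assumed data: K = Q, o = Q[xi_1, xi_2] (so V_r is the affine plane, r = n = 2),
% p_0 = (xi_1^2 + 1, xi_2 - xi_1), a prime zero-dimensional ideal.
\[
  K_{\mathfrak{p}_0} = \mathbb{Q}[\xi_1,\xi_2]/(\xi_1^2+1,\ \xi_2-\xi_1)
  \cong \mathbb{Q}[\xi_1]/(\xi_1^2+1) \cong \mathbb{Q}(i) \neq K .
\]
% No rational c_1 satisfies xi_1 \equiv c_1 (p_0), since that would force c_1^2 + 1 = 0.
% Over the normal extension Q(i) the ideal splits,
\[
  \mathbb{Q}(i)[\xi_1,\xi_2]\cdot\mathfrak{p}_0
  = [\,(\xi_1 - i,\ \xi_2 - i),\ (\xi_1 + i,\ \xi_2 + i)\,] ,
\]
% so P is represented by the two conjugate points (i, i) and (-i, -i).
```

The two points are interchanged by the automorphism $i \to -i$ of $\mathbb{Q}(i)$ over $\mathbb{Q}$, in keeping with the normality of the extension.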

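More generally, we associate with a prime s-dimensional ideal $\mathfrak{p}_s$ in $\mathfrak{o}$
an irreducible algebraic s-dimensional subvariety $V_s$ of $V_r$. The coördinates
of the general point of this $V_s$ are the elements $\bar{\xi}_1, \ldots, \bar{\xi}_n$ upon which the
elements $\xi_1, \ldots, \xi_n$ are mapped in the homomorphism $\mathfrak{o} \to \mathfrak{o}/\mathfrak{p}_s$. The residue
class field $K_{\mathfrak{p}_s}$ (i.e. the quotient field of $\mathfrak{o}/\mathfrak{p}_s$) is the field of rational functions
on $V_s$: $K_{\mathfrak{p}_s} = K(\bar{\xi}_1, \ldots, \bar{\xi}_n)$. Given two irreducible subvarieties $V_s$ and $V'_\sigma$
of $V_r$, defined by the prime ideals $\mathfrak{p}_s$ and $\mathfrak{p}'_\sigma$ respectively, we say that $V_s$
belongs to $V'_\sigma$ if $\mathfrak{p}'_\sigma \equiv 0(\mathfrak{p}_s)$.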


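8. Let $V_s$ be an irreducible s-dimensional subvariety of $V_r$ and let
$\mathfrak{p} = \mathfrak{p}_s$ be the corresponding prime s-dimensional $\mathfrak{o}$-ideal. We consider the
quotient ring $\mathfrak{J} = \mathfrak{o}_{\mathfrak{p}}$. The ideal $\mathfrak{J}\cdot\mathfrak{p} = \mathfrak{P}$ is prime and maximal in $\mathfrak{J}$ and
we have $\mathfrak{P} \cap \mathfrak{o} = \mathfrak{p}$.¹⁷ The quotient ring $\mathfrak{J}_{\mathfrak{P}}$ is evidently $\mathfrak{J}$ itself, and the
residue class field of $\mathfrak{P}$ ($= \mathfrak{J}/\mathfrak{P}$) coincides with $K_{\mathfrak{p}}$.


DEFINITION. $V_s$ is said to be a simple subvariety of $V_r$ if there exist
$r-s$ elements $\eta_1, \ldots, \eta_{r-s}$ in $\mathfrak{J}$ such that:

(9)     $\mathfrak{P} = \mathfrak{J}\cdot(\eta_1, \ldots, \eta_{r-s})$.

Elements such as $\eta_1, \ldots, \eta_{r-s}$ shall be referred to in the sequel as
uniformizing parameters along $V_s$, or at $V_s$.

We shall see later that if $V_s$ is simple, then the uniformizing parameters
$\eta_i$ can already be found in the ring $\mathfrak{o}$. Now if $\eta_1, \ldots, \eta_{r-s}$ are in $\mathfrak{o}$, then (9)
is equivalent to the condition that $\mathfrak{p}$ itself occur among the maximal primary
components of the ideal $\mathfrak{o}\cdot(\eta_1, \ldots, \eta_{r-s})$:

(9′)    $\mathfrak{o}\cdot(\eta_1, \ldots, \eta_{r-s}) = [\mathfrak{p}, \ldots]$,

where the right-hand side is a decomposition of $\mathfrak{o}\cdot(\eta_1, \ldots, \eta_{r-s})$ into maximal
primary components.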
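
The definition and the relation (9′) can be checked on two plane curves. Both curves are illustrative choices of our own and do not occur in the text; K is any field of characteristic zero, and in both cases $r = 1$, $s = 0$, so that a single uniformizing parameter is required.

```latex
% Example A (assumed): the circle  x^2 + y^2 = 1,  o = K[x,y]/(x^2+y^2-1),
% point P = (1,0),  p = o.(x-1, y).
% In J = o_p the element x+1 is a unit, and x - 1 = -y^2/(x+1), hence
\[
  \mathfrak{P} = \mathfrak{J}\cdot(x-1,\ y) = \mathfrak{J}\cdot(y), \qquad \eta_1 = y ,
\]
% so P is simple.  Relation (9'): o/(y) = K[x]/(x^2-1), whence
\[
  \mathfrak{o}\cdot(y) = [\,(x-1,\ y),\ (x+1,\ y)\,] = [\,\mathfrak{p},\ \ldots\,] .
\]
% Example B (assumed): the cuspidal cubic  y^2 = x^3,  o = K[x,y]/(y^2-x^3),
% point P = (0,0),  p = o.(x, y).
\[
  \dim_K \mathfrak{P}/\mathfrak{P}^2 = 2 ,
\]
% since the only relation y^2 - x^3 lies in P^2.  A principal ideal P = J.(eta)
% would give dimension at most 1, so no uniformizing parameter exists
% and P is not simple.
```

In Example A the second component $(x+1, y)$ is the other point of the circle on the line $y = 0$; it is one of the components left unspecified on the right-hand side of (9′).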


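¹⁶ This set is finite since the relative algebraic closure of K in $K_{\mathfrak{p}_0}$ is a finite
extension of K. See the footnote 14 and the considerations at the end of Section 6.

¹⁷ Concerning the relationship between the prime ideals in $\mathfrak{o}$ and in $\mathfrak{J}$ see Section
5. To that we add that, more generally, there is a (1,1) correspondence between the
ideals $\mathfrak{A}$ in $\mathfrak{J}$ and those $\mathfrak{o}$-ideals $\mathfrak{q}$ which have the property that each maximal
primary component of $\mathfrak{q}$ is a multiple of $\mathfrak{p}$. If $\mathfrak{A}$ and $\mathfrak{q}$ are corresponding ideals, then
$\mathfrak{A} = \mathfrak{J}\cdot\mathfrak{q}$, $\mathfrak{A} \cap \mathfrak{o} = \mathfrak{q}$. If $\mathfrak{q}$ is an arbitrary ideal in $\mathfrak{o}$, then the ideal $\mathfrak{J}\mathfrak{q} \cap \mathfrak{o}$ differs
from $\mathfrak{q}$ only by primary components which are not multiples of $\mathfrak{p}$ (such primary
components are missing in the decomposition of $\mathfrak{J}\mathfrak{q} \cap \mathfrak{o}$).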




Our purpose is to derive from the above definition a number of
characteristic properties of simple subvarieties. The results will be in the main
generalizations of theorems proved by us elsewhere ¹⁸ for simple points in the
case of an algebraically closed ground field K. Practically all the rest of this
paper deals with simple points. Once the results for simple points are
established, the extension of these results to simple subvarieties of any
dimension is rapidly achieved by the usual artifice of a transcendental extension
of the ground field.

Dealing with a point P of $V_r$, given by a prime zero-dimensional $\mathfrak{o}$-ideal
$\mathfrak{p}$, we shall proceed in the following manner. The residue class field $K_{\mathfrak{p}}$ is a
finite algebraic extension of K. Let $K^*$ be the least normal extension of K
which contains $K_{\mathfrak{p}}$. We take $K^*$ as new ground field and we pass to the field
$\Sigma^* = K^*\Sigma$ and to the ring $\mathfrak{o}^* = K^*\mathfrak{o} = K^*[\xi_1, \ldots, \xi_n]$. Regarded as
elements of $\Sigma^*$ the elements $\xi_1, \ldots, \xi_n$ are the coördinates of the general point
of an irreducible variety $V^*_r$. Let $\mathfrak{p}^*_1, \ldots, \mathfrak{p}^*_h$ be the prime $\mathfrak{o}^*$-ideals which
lie over $\mathfrak{p}$ and let $P^*_1, \ldots, P^*_h$ be the corresponding points of $V^*_r$. We may
say that these points $P^*_i$ correspond to the point P, and that P splits into
the h points $P^*_i$ of $V^*_r$. By Lemma 2 (section 2), the residue class field
$K^*_{\mathfrak{p}^*_i}$ coincides with $K^*K_{\mathfrak{p}}$, and since $K_{\mathfrak{p}} \subseteq K^*$, it follows that $K^*_{\mathfrak{p}^*_i} = K^*$.
Thus on the new variety $V^*_r$ we now are dealing with points $P^*_i$ $(i = 1, 2, \ldots, h)$
which have the property that for each of them the residue class field $K^*_{\mathfrak{p}^*_i}$
coincides with the ground field $K^*$. If we now pass from $K^*$ to the
algebraically closed field determined by K, then each prime ideal $\mathfrak{p}^*_i$ remains prime
(section 6), and it stands to reason that the results valid in the case of
algebraically closed ground fields ¹⁸ can therefore be carried over to the points
$P^*_i$ of $V^*_r$. For this reason we study first the special case in which $K_{\mathfrak{p}} = K$.
When this special case has been settled, the only thing left to do in the general
case will be to study the finite ground field extension $K \to K^*$, where $K^*$ is
the least normal extension of K which contains $K_{\mathfrak{p}}$.


III. Simple points. Case $K_{\mathfrak{p}} = K$.


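9. Let $\mathfrak{p}$ be a prime zero-dimensional ideal in $\mathfrak{o}$ ($= K[\xi_1, \dots, \xi_n]$) and let the corresponding point $P$ of $V_r$ be a simple point, with $\eta_1, \dots, \eta_r$ as uniformizing parameters, and such that the residue class field $K_{\mathfrak{p}}$ ($= \mathfrak{o}/\mathfrak{p}$, since $\mathfrak{p}$ is divisorless) coincides with the ground field $K$. Let $K^*$ be the algebraically closed field determined by $K$, and let $\Sigma^* = K^*\Sigma$, $\mathfrak{o}^* = K^*\mathfrak{o} = K^*[\xi_1, \dots, \xi_n]$, $\mathfrak{J}^* = K^*\mathfrak{J}$, where $\mathfrak{J} = \mathfrak{o}_{\mathfrak{p}}$. As was pointed out in the preceding section, the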


18 “ Some results in the arithmetic theory of algebraic varieties,” American Journal 
of Mathematics, vol. 61 (April, 1939), no. 2, pp. 249-294. 




ALGEBRAIC VARIETIES OVER GROUND FIELDS OF CHARACTERISTIC ZERO. 201 


ideal $\mathfrak{o}^*\mathfrak{p} = \mathfrak{p}^*$ is prime and lies over $\mathfrak{p}$. It is maximal in $\mathfrak{o}^*$, hence is zero-dimensional, and defines a point $P^*$ of the variety $V^*_r$.


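LEMMA 3. $\mathfrak{J}^* = \mathfrak{o}^*_{\mathfrak{p}^*}$.


Proof. Since the relation $\mathfrak{J}^* \subseteq \mathfrak{o}^*_{\mathfrak{p}^*}$ is trivial, in view of the relation $\mathfrak{p}^* \cap \mathfrak{o} = \mathfrak{p}$, we have only to show that $\mathfrak{o}^*_{\mathfrak{p}^*}$ is contained in $\mathfrak{J}^*$. Let $\alpha^*/\beta^*$ be an element of $\mathfrak{o}^*_{\mathfrak{p}^*}$, $\alpha^*, \beta^* \in \mathfrak{o}^*$, $\beta^* \not\equiv 0\ (\mathfrak{p}^*)$. Since $\mathfrak{p}^*$ is maximal, the ideal $(\mathfrak{o}^*\mathfrak{p}, \beta^*)$ is the unit ideal, i.e. we have a relation of the form

(10) $1 + \xi^* = A^*\beta^*$, $A^* \in \mathfrak{o}^*$.

Here $\xi^*$ is an element of $\mathfrak{o}^*\mathfrak{p}$ and hence can be put in the form:

$\xi^* = \xi_0 + \xi_1\theta_1 + \cdots + \xi_{g-1}\theta_1^{g-1}$,

where $\xi_i \equiv 0\ (\mathfrak{p})$ and $\theta_1$ is an element of $K^*$ satisfying an irreducible equation of degree $g$ over $K$. If $\theta_2, \dots, \theta_g$ are the conjugates of $\theta_1$ over $K$ and if we multiply (10) by $\prod_{i=2}^{g}(1 + \xi^*_i)$, where $\xi^*_i = \xi_0 + \xi_1\theta_i + \cdots + \xi_{g-1}\theta_i^{g-1}$, we get a relation of the form:

(11) $1 + \eta = B^*\beta^*$,

where $\eta \equiv 0\ (\mathfrak{p})$ and $B^* \in \mathfrak{o}^*$. Hence $\alpha^*/\beta^* = \alpha^*B^*/(1 + \eta)$, and since $1 + \eta$ is an element of $\mathfrak{o}$, not in $\mathfrak{p}$, it follows that $\alpha^*/\beta^*$ belongs to $\mathfrak{J}^*$, q.e.d.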
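
The conjugate-product step in the proof of Lemma 3 can be written out in the smallest case. The following is only an illustrative sketch: the choices $g = 2$, $\theta_1 = \sqrt{2}$, $\theta_2 = -\sqrt{2}$ (so that $\sqrt{2}$ is assumed not to lie in $K$) are assumptions made for the example and do not occur in the text.

```latex
% Illustrative assumption: g = 2, \theta_1 = \sqrt{2}, \theta_2 = -\sqrt{2}.
\begin{align*}
\xi^* &= \xi_0 + \xi_1\sqrt{2}, \qquad
\xi^*_2 = \xi_0 - \xi_1\sqrt{2}, \qquad
\xi_0, \xi_1 \equiv 0 \ (\mathfrak{p}), \\
(1 + \xi^*)(1 + \xi^*_2) &= (1 + \xi_0)^2 - 2\xi_1^2 = 1 + \eta, \qquad
\eta = 2\xi_0 + \xi_0^2 - 2\xi_1^2 .
\end{align*}
% No \sqrt{2} survives in \eta, so \eta lies in \mathfrak{o}; every term of \eta
% contains a factor \xi_0 or \xi_1, so \eta \equiv 0 (\mathfrak{p}).
```

In this case the product of $1 + \xi^*$ with its conjugate is an element $1 + \eta$ of $\mathfrak{o}$ not in $\mathfrak{p}$, which is what the proof needs in order to replace the denominator $\beta^*$ by a denominator taken from $\mathfrak{o}$.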
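Let $\mathfrak{P}^* = \mathfrak{J}^*\mathfrak{P} = \mathfrak{J}^*\mathfrak{p}^*$, where $\mathfrak{P} = \mathfrak{p}\cdot\mathfrak{o}_{\mathfrak{p}}$. By (9), we have

$\mathfrak{J}^*\cdot(\eta_1, \dots, \eta_r) = \mathfrak{P}^*$,

and since by the preceding Lemma the quotient ring $\mathfrak{o}^*_{\mathfrak{p}^*}$ coincides with $\mathfrak{J}^*$, it follows that $P^*$ is a simple point of $V^*_r$ and that $\eta_1, \dots, \eta_r$ are uniformizing parameters at $P^*$. Since $K^*$ is algebraically closed, we are in position to apply the results of our paper (footnote 18).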

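Since $K_{\mathfrak{p}} = K$, every element $\omega$ of $\mathfrak{J}$ satisfies a congruence of the form: $\omega \equiv c\ (\mathfrak{P})$, $c \in K$. In particular, let $\xi_i \equiv c_i\ (\mathfrak{P})$, $c_i \in K$. The point $P$ is therefore represented by an actual point $(c_1, \dots, c_n)$ of the affine $S_n$. We shall assume from now on that $P$ is the origin of coördinates in $S_n$, whence $\xi_i \equiv 0\ (\mathfrak{P})$, $i = 1, 2, \dots, n$.

By (9), every element $\omega$ of $\mathfrak{P}$ can be put in the form: $\omega = A_1\eta_1 + \cdots + A_r\eta_r$, $A_i \in \mathfrak{J}$. Let $A_i \equiv c_i\ (\mathfrak{P})$. Then


(12) $\omega \equiv c_1\eta_1 + \cdots + c_r\eta_r \ (\mathfrak{P}^2)$.


A congruence such as (12) holds true for any element $\omega$ in $\mathfrak{P}$. Since $\mathfrak{P}^* = \mathfrak{J}^*\mathfrak{P}$, it follows from (12) that $\omega \equiv c_1\eta_1 + \cdots + c_r\eta_r \ (\mathfrak{P}^{*2})$. We have proved (see footnote 18) that in this last congruence the coefficients $c_1, \dots, c_r$ are uniquely determined. From this we conclude immediately with the following: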




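THEOREM 4. The coefficients $c_1, \dots, c_r$ in (12) are uniquely determined and belong to $K$. The elements $\eta_1, \dots, \eta_r$ are linearly independent mod $\mathfrak{P}^{*2}$ over $K^*$. Moreover, we have the following relation:

(13) $\mathfrak{P}^{*2} \cap \mathfrak{J} = \mathfrak{P}^2$.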

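By a similar argument and observing that $\mathfrak{P}^2 = \mathfrak{J}\cdot(\dots, \eta_i\eta_j, \dots)$, we find that any element $\omega$ in $\mathfrak{P}$ satisfies a congruence of the form:

$\omega \equiv \sum_i c_i\eta_i + \sum_{i \le j} c_{ij}\eta_i\eta_j \ (\mathfrak{P}^3)$, $c_{ij} \in K$,

where the coefficients $c_1, \dots, c_r$ are the same as in (12). Proceeding in the same fashion, we find, more generally, that for any element $\omega$ in $\mathfrak{J}$ there exists a formal integral power series:

$\psi_0 + \psi_1 + \psi_2 + \cdots$,

where $\psi_i$ is a form of degree $i$ in $\eta_1, \dots, \eta_r$, with coefficients in $K$, such that

(14) $\omega \equiv \psi_0 + \psi_1 + \cdots + \psi_m \ (\mathfrak{P}^{m+1})$, $m$ arbitrary.

Here $\psi_0 = 0$, if and only if $\omega \equiv 0\ (\mathfrak{P})$. From (14) it follows that $\omega \equiv \psi_0 + \psi_1 + \cdots + \psi_m \ (\mathfrak{P}^{*\,m+1})$, and we know that in this congruence the polynomial $\psi_0 + \psi_1 + \cdots + \psi_m$ is uniquely determined by $\omega$. Hence, if $\omega \equiv 0\ (\mathfrak{P}^{*\,m+1})$, then this polynomial must be identically zero, and consequently

(15) $\mathfrak{P}^{*\,m+1} \cap \mathfrak{J} = \mathfrak{P}^{m+1}$, $m$ arbitrary,

a generalization of (13).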

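The result to the effect that the uniformizing parameters at $P$ are also uniformizing parameters at $P^*$ can be inverted. We show namely that if $P$ is a simple point and if $r$ elements $\omega_1, \dots, \omega_r$ in $\mathfrak{o}$ are uniformizing parameters at $P^*$, then they are also uniformizing parameters at $P$, i.e. $\mathfrak{J}^*(\omega_1, \dots, \omega_r) = \mathfrak{P}^*$ implies $\mathfrak{J}(\omega_1, \dots, \omega_r) = \mathfrak{P}$. Let

(16) $\omega_i \equiv c_{i1}\eta_1 + \cdots + c_{ir}\eta_r \ (\mathfrak{P}^{*2})$,

$c_{ij} \in K$. It has been proved (footnote 18, Theorem 1) that the non-vanishing of the determinant $|c_{ij}|$ is a necessary and sufficient condition in order that $\omega_1, \dots, \omega_r$ be uniformizing parameters at $P^*$. Hence $|c_{ij}| \neq 0$, and since the $c_{ij}$ are in $K$ we conclude from (16) and (13) that $\eta_1, \dots, \eta_r$ satisfy congruences of the form:

$\eta_i \equiv d_{i1}\omega_1 + \cdots + d_{ir}\omega_r \ (\mathfrak{P}^2)$, ($i = 1, 2, \dots, r$), $d_{ij} \in K$.

Hence, by (12), every element $\omega$ in $\mathfrak{P}$ satisfies a congruence of the form: $\omega \equiv e_1\omega_1 + \cdots + e_r\omega_r \ (\mathfrak{P}^2)$, $e_i \in K$. Denote the ideal $\mathfrak{J}\cdot(\omega_1, \dots, \omega_r)$ by $\mathfrak{A}$. The above relation implies the following relation:


(17) $\mathfrak{P} = (\mathfrak{A}, \mathfrak{P}^2)$.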




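Since $\mathfrak{J}^*\mathfrak{A} = \mathfrak{P}^*$, it follows that $\mathfrak{A}$ has no prime ideal divisors in $\mathfrak{J}$ other than $\mathfrak{P}$ (footnote 19). Consequently $\mathfrak{A}$ is a primary ideal, with $\mathfrak{P}$ as associated prime ideal, and this, in view of (17), implies that $\mathfrak{A}$ coincides with $\mathfrak{P}$ (footnote 20), as was asserted.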
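
The passage from (16) to the congruences for the $\eta_i$ is a matrix inversion followed by an application of (13). A minimal sketch for $r = 2$ follows. It assumes that relation (13) reads $\mathfrak{P}^{*2} \cap \mathfrak{J} = \mathfrak{P}^2$, as its generalization (15) suggests, and the symbol $D$ for the determinant is introduced here only for the illustration.

```latex
% Sketch for r = 2; D is a name chosen for this example.
\begin{align*}
\omega_1 &\equiv c_{11}\eta_1 + c_{12}\eta_2, &
\omega_2 &\equiv c_{21}\eta_1 + c_{22}\eta_2 \pmod{\mathfrak{P}^{*2}}, &
D &= c_{11}c_{22} - c_{12}c_{21} \neq 0, \\
\eta_1 &\equiv \frac{c_{22}\,\omega_1 - c_{12}\,\omega_2}{D}, &
\eta_2 &\equiv \frac{-c_{21}\,\omega_1 + c_{11}\,\omega_2}{D} \pmod{\mathfrak{P}^{*2}}.
\end{align*}
% Both sides of each congruence lie in \mathfrak{J} because c_{ij} \in K, so each
% difference lies in \mathfrak{P}^{*2} \cap \mathfrak{J} = \mathfrak{P}^2 by (13).
```

The coefficients $d_{ij}$ of the text are therefore the entries of the inverse matrix of $(c_{ij})$, and they lie in $K$ because the $c_{ij}$ do.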

We now are in position to prove the following 

THEOREM 5. There exist uniformizing parameters $\omega_1, \dots, \omega_r$ at $P$ which are elements of $\mathfrak{o}$ and are such that $\mathfrak{o}$ is integrally dependent on the ring $K[\omega_1, \dots, \omega_r]$. Such parameters are furnished, for instance, by linear forms in $\xi_1, \dots, \xi_n$, with non special coefficients in $K$.

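Proof. Since the original uniformizing parameters $\eta_1, \dots, \eta_r$ are polynomials in $\xi_1, \dots, \xi_n$, and since $\xi_i \equiv 0\ (\mathfrak{P})$, we have relations of the form: $\eta_i \equiv \sum_{j=1}^{n} c_{ij}\xi_j \ (\mathfrak{P}^2)$, $i = 1, 2, \dots, r$. The $r$ forms $\sum_{j=1}^{n} c_{ij}\xi_j$ are linearly independent mod $\mathfrak{P}^2$ (Theorem 4). Hence if $\xi_i \equiv \sum_{j=1}^{r} e_{ij}\eta_j \ (\mathfrak{P}^2)$, then the $n$ by $r$ matrix $(e_{ij})$ must be of rank $r$. If then we put $\omega_i = \sum_{j=1}^{n} u_{ij}\xi_j$, $i = 1, 2, \dots, r$, then for non special constants $u_{ij}$ in $K$ the $r$-row square matrix $(u_{ij})(e_{j\nu})$ will be non singular. The elements $\omega_1, \dots, \omega_r$ will then be uniformizing parameters at $P^*$, hence also at $P$.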

In addition, by a well known normalization theorem of E. Noether, for non special $u_{ij}$ the ring $\mathfrak{o}$ will be integrally dependent on $K[\omega_1, \dots, \omega_r]$. This completes the proof of the theorem.


10. We have seen in the preceding section that if the point $P$ is simple for $V_r$, then $P^*$ is simple for $V^*_r$. It can be shown by examples that the converse is not generally true (footnote 21). We prove, however, the following


THEOREM 6. Under the hypothesis $K_{\mathfrak{p}} = K$, a necessary and sufficient condition that $P$ be a simple point of $V_r$ is that $P^*$ be a simple point of $V^*_r$ and that $K$ be maximally algebraic in $\Sigma$ ($K$ algebraically closed in $\Sigma$).


Proof. The condition is sufficient. For if $K$ is maximally algebraic in $\Sigma$,


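19 Let $\mathfrak{P}_1$ be a prime ideal divisor of $\mathfrak{A}$. There exists in $\mathfrak{J}^*$ at least one prime ideal, say $\mathfrak{P}^*_1$, which lies over $\mathfrak{P}_1$ (Krull). Since $\mathfrak{P}^*_1$ is a divisor of $\mathfrak{J}^*\mathfrak{A}$ and since $\mathfrak{J}^*\mathfrak{A}$ ($= \mathfrak{P}^*$) is maximal, necessarily $\mathfrak{P}^*_1 = \mathfrak{P}^*$, whence $\mathfrak{P}_1 = \mathfrak{P}$.

20 Let $\rho$ be the exponent of $\mathfrak{A}$, i.e. let $\mathfrak{P}^{\rho} \equiv 0\ (\mathfrak{A})$, $\mathfrak{P}^{\rho-1} \not\equiv 0\ (\mathfrak{A})$. Assuming $\rho > 1$, we multiply (17) by $\mathfrak{P}^{\rho-2}$, getting $\mathfrak{P}^{\rho-1} = (\mathfrak{A}\cdot\mathfrak{P}^{\rho-2}, \mathfrak{P}^{\rho}) \equiv 0\ (\mathfrak{A})$, a contradiction.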

**We refer to the example given in the footnote*. Let p=o- (4, v2), 
p'=p*(«#). The point P* is simple, and # is a uniformizing parameter at P*. The 
quotient field K, is obviously the field K. However, P is not a simple point, since 
the ring p/p? is a K-module of rank 2, while, according to Theorem 4, the ring p/p? 
for a simple point must be of rank r. In the present case we have r= 




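then the relation $\mathfrak{J}^*\mathfrak{A} \cap \mathfrak{J} = \mathfrak{A}$ holds true for any $\mathfrak{J}$-ideal $\mathfrak{A}$ (Theorem 1). If, in addition, $P^*$ is simple for $V^*_r$, then from the proof of Theorem 5 it follows that we can find uniformizing parameters $\omega_1, \dots, \omega_r$ at $P^*$ which are elements of $\mathfrak{o}$. We will have then the relation: $\mathfrak{J}^*\cdot(\omega_1, \dots, \omega_r) = \mathfrak{P}^*$. Since $\mathfrak{P}^* \cap \mathfrak{J} = \mathfrak{P}$ and since, by Theorem 1, $\mathfrak{J}^*(\omega_1, \dots, \omega_r) \cap \mathfrak{J} = \mathfrak{J}(\omega_1, \dots, \omega_r)$, it follows that $\mathfrak{J}\cdot(\omega_1, \dots, \omega_r) = \mathfrak{P}$, whence $P$ is a simple point of $V_r$.

The condition is necessary. Let $\eta_1, \dots, \eta_r$ be uniformizing parameters at $P$ and let $\theta$ be an element of $\Sigma$ which is algebraically dependent on $K$. Let $\theta = \alpha/\beta$, $\alpha, \beta \in \mathfrak{J}$, and let

$\beta = \psi_\rho + \psi_{\rho+1} + \cdots$

be the power series expansion for $\beta$ (see preceding section, especially congruence (14)). The coefficients of these power series are in $K$. Since $\alpha = \theta\beta$ and $\theta \in K^*$, the element $\alpha$ will have the following power series expansion: $\alpha = \theta\psi_\rho + \theta\psi_{\rho+1} + \cdots$. Since the coefficients of the form $\theta\psi_\rho$ must also be elements of $K$, it follows that $\theta$ is an element of $K$. Hence $K$ is algebraically closed in $\Sigma$, q.e.d.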

Let $\eta_1, \dots, \eta_r$ be $r$ algebraically independent elements in $\mathfrak{o}$ such that $\mathfrak{o}$ is integrally dependent on the ring $K[\eta_1, \dots, \eta_r]$. Let $\omega$ be an element of $\mathfrak{o}$ and let $G(\eta_1, \dots, \eta_r; z)$ be the norm of $z - \omega$ with respect to the field $K(\eta_1, \dots, \eta_r)$. Let moreover $\eta_i \equiv c_i\ (\mathfrak{p})$, $c_i \in K$. In the case of an algebraically closed field $K$ we have proved the following (footnote 18, Theorem 4): a necessary and sufficient condition that $P$ be a simple point and that $\eta_1 - c_1, \dots, \eta_r - c_r$ be uniformizing parameters at $P$, is that there should exist an element $\omega$ in $\mathfrak{o}$ such that $G'_\omega(\eta_1, \dots, \eta_r; \omega) \not\equiv 0\ (\mathfrak{p})$. Using Theorem 6 we are now in position to extend this result to the case under consideration ($K_{\mathfrak{p}} = K$).

Assume that $P$ is a simple point and that the elements $\eta_1 - c_1, \dots, \eta_r - c_r$ are uniformizing parameters at $P$. The elements $\eta_1 - c_1, \dots, \eta_r - c_r$ are then also uniformizing parameters at $P^*$, and $\mathfrak{o}^*$ is integrally dependent on the ring $K^*[\eta_1, \dots, \eta_r]$. By the quoted theorem, proved for the algebraically closed field $K^*$, there exists in $\mathfrak{o}^*$ an element $\omega$ such that $F'_\omega(\eta_1, \dots, \eta_r; \omega) \not\equiv 0\ (\mathfrak{p}^*)$, where $F(\eta_1, \dots, \eta_r; z)$ is the norm of $z - \omega$ with respect to the field $K^*(\eta_1, \dots, \eta_r)$. More specifically we have shown (footnote 18, p. 269) that we may put $\omega = v_1\xi_1 + \cdots + v_n\xi_n$, $v_i \in K^*$, provided the coefficients $v_i$ do not satisfy certain linear relations with coefficients in $K^*$. Hence, we may choose the $v_i$ in $K$, and we may therefore assume that $\omega$ is an element of $\mathfrak{o}$. The relation $F'_\omega \not\equiv 0\ (\mathfrak{p}^*)$ implies at any rate that $F(\eta_1, \dots, \eta_r; z)$ is irreducible (over


22 An immediate corollary of Theorem 6 is the following: if $K$ is not maximally algebraic in $\Sigma$ then the residue class field $K_{\mathfrak{p}}$ for any simple point $P$ of $V_r$ is necessarily a proper algebraic extension of $K$. This is a special case of a more general theorem (Theorem 9) proved in Section 14.




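$K^*$) and that $\omega$ is a primitive element of $\Sigma^*/K^*(\eta_1, \dots, \eta_r)$. By Theorem 6, $K$ is algebraically closed in $\Sigma$. Hence, by Lemma 1, the relative degrees $[\Sigma^* : K^*(\eta_1, \dots, \eta_r)]$ and $[\Sigma : K(\eta_1, \dots, \eta_r)]$ are the same. Consequently $F(\eta_1, \dots, \eta_r; z)$ is also the norm of $z - \omega$ with respect to the field $K(\eta_1, \dots, \eta_r)$, and since $F'_\omega \not\equiv 0\ (\mathfrak{p}^*)$ implies $F'_\omega \not\equiv 0\ (\mathfrak{p})$, it is thus proved that our condition is necessary.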

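Conversely, assume $G'_\omega(\eta_1, \dots, \eta_r; \omega) \not\equiv 0\ (\mathfrak{p})$, and let $F(\eta_1, \dots, \eta_r; z)$ be the norm of $z - \omega$ with respect to the field $K^*(\eta_1, \dots, \eta_r)$. Clearly $F$ either coincides with $G$ or is a proper factor of $G$, according as the relative degree $[\Sigma^* : K^*(\eta_1, \dots, \eta_r)]$ coincides with or is less than the relative degree $[\Sigma : K(\eta_1, \dots, \eta_r)]$. In either case the relation $G'_\omega \not\equiv 0\ (\mathfrak{p})$ implies the relation $F'_\omega \not\equiv 0\ (\mathfrak{p}^*)$, and hence $P^*$ is a simple point, with $\eta_1 - c_1, \dots, \eta_r - c_r$ as uniformizing parameters. To prove that $P$ is a simple point of $V_r$, we have only to show, according to Theorem 6, that $K$ is maximally algebraic in $\Sigma$. Let $K'$ be the relative algebraic closure of $K$ in $\Sigma$ and let $[K' : K] = g$. Let $\theta_1$ be a primitive element of $K'$ over $K$, so that $K' = K(\theta_1)$. Since the relative degrees $[\Sigma : K'(\eta_1, \dots, \eta_r)]$ and $[\Sigma^* : K^*(\eta_1, \dots, \eta_r)]$ are the same (Lemma 1), $F(\eta_1, \dots, \eta_r; z)$ is also the norm of $z - \omega$ with respect to the field $K'(\eta_1, \dots, \eta_r)$. Hence $F$ is a polynomial in $\eta_1, \dots, \eta_r$, $z$ and $\theta_1$, with coefficients in $K$: $F = F(\eta_1, \dots, \eta_r; z; \theta_1)$. If $\theta_2, \dots, \theta_g$ are the conjugates of $\theta_1$ over $K$, then

(18) $G(\eta_1, \dots, \eta_r; z) = \prod_{i=1}^{g} F(\eta_1, \dots, \eta_r; z; \theta_i)$.

Let $\omega \equiv c\ (\mathfrak{p})$, $c \in K$ (since $K_{\mathfrak{p}} = K$). If we reduce the equation $F(\eta_1, \dots, \eta_r; \omega; \theta_1) = 0$ modulo $\mathfrak{p}^*$, we get $F(c_1, \dots, c_r; c; \theta_1) = 0$. Hence also $F(c_1, \dots, c_r; c; \theta_i) = 0$, $i = 1, 2, \dots, g$, i.e.

(18') $F(\eta_1, \dots, \eta_r; \omega; \theta_i) \equiv 0\ (\mathfrak{p}^*)$, ($i = 1, 2, \dots, g$).

Now if $g$ were greater than 1, then it would follow from (18) and (18') that $G'_\omega \equiv 0\ (\mathfrak{p}^*)$, i.e. $G'_\omega \equiv 0\ (\mathfrak{p})$, since $G'_\omega \in \mathfrak{o}$, a contradiction. Hence $g = 1$, i.e. $K' = K$, q.e.d.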
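
The last step, that $g > 1$ forces $G'_\omega \equiv 0\ (\mathfrak{p}^*)$, is the product rule applied to the factorization of $G$ over the conjugates of $\theta_1$. A sketch for $g = 2$ is given below. It assumes that (18) expresses $G$ as the product of the conjugate polynomials $F(\dots; z; \theta_i)$, and the abbreviation $F_i(z)$ is introduced here only for the illustration.

```latex
% Sketch for g = 2; F_i(z) abbreviates F(\eta_1, \dots, \eta_r; z; \theta_i).
\begin{align*}
G(z) &= F_1(z)\,F_2(z), \\
G'(z) &= F_1'(z)\,F_2(z) + F_1(z)\,F_2'(z), \\
G'(\omega) &\equiv F_1'(\omega)\cdot 0 + 0\cdot F_2'(\omega) \equiv 0 \pmod{\mathfrak{p}^*},
\end{align*}
% using F_1(\omega) \equiv F_2(\omega) \equiv 0 (\mathfrak{p}^*) from (18').
```

For general $g > 1$ every term of the derivative of the product still contains $g - 1 \geq 1$ undifferentiated factors $F_j(\omega)$, each of which is congruent to zero, so the same conclusion holds.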


Remark. Let (m,° * 9r) = [P, 415 42," be the decomposition of 
the ideal 0 into maximal primary components (see (9’)), where 
We assume that $\eta_1, \dots, \eta_r$ are uniformizing parameters at the simple point $P$ and that $\mathfrak{o}$ is integrally dependent on $K[\eta_1, \dots, \eta_r]$. This last condition implies that the above primary components are all zero-dimensional. Let $\omega \equiv c\ (\mathfrak{p})$. For algebraically closed ground fields we have proved (footnote 18, Theorem 4) that the elements $\omega$ such that $G'_\omega \not\equiv 0\ (\mathfrak{p})$ are characterized by the condition: $\omega \not\equiv c\ (\mathfrak{p}_i)$, $i = 1, 2, \dots$, where $\mathfrak{p}_1, \mathfrak{p}_2, \dots$ are the prime ideals to which $\mathfrak{q}_1, \mathfrak{q}_2, \dots$ belong respectively. It is clear that this result holds true also in the present case where $K_{\mathfrak{p}} = K$. It is sufficient to take into account the


IV. Simple points. General case.


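11. Let $P$ be a point of $V_r$, $\mathfrak{p}$ the corresponding prime zero-dimensional ideal in $\mathfrak{o}$. Following Section 8, we extend our ground field $K$ as follows: we pass from $K$ to the least normal extension $K^*$ which contains the residue class field $K_{\mathfrak{p}}$. We put: $\Sigma^* = K^*\Sigma$, $\mathfrak{o}^* = K^*\mathfrak{o}$, $\mathfrak{J}^* = K^*\mathfrak{J}$, where $\mathfrak{J} = \mathfrak{o}_{\mathfrak{p}}$. Since $K^*$ is a finite extension of $K$, there is only a finite number of prime $\mathfrak{o}^*$-ideals which lie over $\mathfrak{p}$, say $\mathfrak{p}^*_1, \dots, \mathfrak{p}^*_h$, and we have, by Theorem 2:


(19) $\mathfrak{o}^*\mathfrak{p} = [\mathfrak{p}^*_1, \dots, \mathfrak{p}^*_h]$.


We denote by $P^*_i$ the point of $V^*_r$ defined by the prime ideal $\mathfrak{p}^*_i$. For the residue class field $K^*_{\mathfrak{p}^*_i}$ we have the relation (Lemma 2, Section 2): $K^*_{\mathfrak{p}^*_i} = K^*\cdot K_{\mathfrak{p}} = K^*$, since $K_{\mathfrak{p}} \subseteq K^*$. Hence the residue class field at each point $P^*_i$ coincides with the new ground field $K^*$. If then $P^*_i$ is a simple point, we have, as far as $V^*_r$ is concerned, the situation studied in the preceding Part III.

We prove the relation $\mathfrak{J}^* = \mathfrak{o}^*_{\mathfrak{p}^*_1} \cap \cdots \cap \mathfrak{o}^*_{\mathfrak{p}^*_h}$.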


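Proof. It is clear that $\mathfrak{J}^*$ is contained in each of the quotient rings $\mathfrak{o}^*_{\mathfrak{p}^*_i}$. Hence to prove the stated relation we have only to show that if $\eta^*$ is an element which belongs to each quotient ring $\mathfrak{o}^*_{\mathfrak{p}^*_i}$, then $\eta^* \in \mathfrak{J}^*$. We can write $\eta^* = \alpha^*_i/\beta^*_i$, $i = 1, 2, \dots, h$, where $\alpha^*_i, \beta^*_i \in \mathfrak{o}^*$, $\beta^*_i \not\equiv 0\ (\mathfrak{p}^*_i)$. Since the $\mathfrak{p}^*_i$ are maximal ideals, we can find, for each $i = 1, 2, \dots, h$, an element $\omega^*_i$ in $\mathfrak{o}^*$ satisfying the congruences: $\omega^*_i \equiv 1\ (\mathfrak{p}^*_i)$, $\omega^*_i \equiv 0\ (\mathfrak{p}^*_j)$, $j \neq i$. If we put $\gamma^* = \alpha^*_1\omega^*_1 + \cdots + \alpha^*_h\omega^*_h$, $\delta^* = \beta^*_1\omega^*_1 + \cdots + \beta^*_h\omega^*_h$, then $\eta^* = \gamma^*/\delta^*$ and $\delta^* \equiv \beta^*_i \not\equiv 0\ (\mathfrak{p}^*_i)$. We have thus found for $\eta^*$ a quotient representation $\gamma^*/\delta^*$ in which the denominator $\delta^*$ is not in any of the ideals $\mathfrak{p}^*_i$, $i = 1, 2, \dots, h$. But then the ideal $(\mathfrak{o}^*\mathfrak{p}, \mathfrak{o}^*\delta^*)$ is the unit ideal, and the rest of the proof is the same as that of Lemma 3 (Section 9).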

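It now follows immediately that $\mathfrak{J}^*$ has $h$ and only $h$ distinct prime zero-dimensional ideals, which correspond to the ideals $\mathfrak{p}^*_1, \dots, \mathfrak{p}^*_h$ respectively (footnote 23). Namely, let $\mathfrak{J}^*_i = \mathfrak{o}^*_{\mathfrak{p}^*_i}$, and let $\mathfrak{P}^*_i$ be the prime zero-dimensional ideal of $\mathfrak{J}^*_i$: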


(20) — p*; —gpta B*;. 
Then 


3 Let x be a prime zero-dimensional ideal in %*, and let $a 9* = p*. The prime 
* 
ideal p* is zero-dimensional and must coincide with one of the ideals p* prope 




The quotient ring the quotient since of = p*i. 


On the other hand we have 3*i, since 9 = —;. Hence 
wrk 


12. The relations (20-22) are true for any point P, simple or not. Now 
we assume that P is a simple point and that m,---,y, are uniformizing 
parameters at P. We have then (m,° Hence, by 
(19) and (21), 9r) and consequently, in view 
of (22), (ms° ar) =B*i. This shows that each of the points P*; 
is a simple point of $V^*_r$ and that $\eta_1, \dots, \eta_r$ are uniformizing parameters at $P^*_i$. The following theorem is in a sense the converse:


THEOREM 7. If $P$ is a simple point of $V_r$ and if the elements $\omega_1, \dots, \omega_r$ of $\mathfrak{J}$ are uniformizing parameters at one of the points $P^*_i$, then they are also uniformizing parameters at $P$.


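Proof. For the proof, we first establish the following relation:

(23) $\mathfrak{P}^{*\,m}_i \cap \mathfrak{J} = \mathfrak{P}^m$, $m$ an arbitrary integer $\geq 0$.

Since $\mathfrak{P}^m \equiv 0\ (\mathfrak{P}^{*\,m}_i)$, we have only to show that any element $\alpha$ of $\mathfrak{P}^{*\,m}_i \cap \mathfrak{J}$ is contained in $\mathfrak{P}^m$. Let $\eta_1, \dots, \eta_r$ be uniformizing parameters at $P$. The element $\alpha$ certainly belongs to $\mathfrak{P}$, and since $\mathfrak{P} = \mathfrak{J}\cdot(\eta_1, \dots, \eta_r)$, we have: $\alpha = A_1\eta_1 + \cdots + A_r\eta_r$, $A_i \in \mathfrak{J}$. If $A_1, \dots, A_r$ also belong to $\mathfrak{P}$, then we can put $\alpha$ in the form: $\alpha = \sum_{i,j} A_{ij}\eta_i\eta_j$. Continuing in this manner we will ultimately get for $\alpha$ an expression of the form: $\alpha = \phi_s(\eta_1, \dots, \eta_r)$, where $\phi_s$ is a form of degree $s$ in $\eta_1, \dots, \eta_r$ whose coefficients $A_{(i)}$ are elements of $\mathfrak{J}$, and the following are the only two possibilities: (a) either $s = m$, or (b) $s < m$ and not all the elements $A_{(i)}$ are in $\mathfrak{P}$. In the case (a) we have $\alpha \equiv 0\ (\mathfrak{P}^m)$, as was asserted. We show that the case (b) leads to a contradiction. Let us denote by $\bar{\phi}_s$ ($= \bar{\phi}_s(\eta_1, \dots, \eta_r)$) the reduced form obtained from $\phi_s(\eta_1, \dots, \eta_r)$ by reducing the elements $A_{(i)}$ modulo $\mathfrak{P}^*_i$ to elements of $K^*$ (footnote 24). By hypothesis the form $\bar{\phi}_s$ is not identically zero in $\eta_1, \dots, \eta_r$. Since $\alpha \equiv 0\ (\mathfrak{P}^{*\,m}_i)$ and since $s < m$, we have obviously the congruence:

$\bar{\phi}_s(\eta_1, \dots, \eta_r) \equiv 0\ (\mathfrak{P}^{*\,s+1}_i)$.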


because any element which is not in any one of the ideals p*p: . sD", is a unit in 3": 

Let, say, ag*=p"*, If a*/B is an element of where a* 9*, BC 9, BO(p), 
then a* = 0(*,), since is a unit in Hence 93 . contains at least 
h distinct prime zero-dimensional ideals, namely the ideals 
They are distinct, because =p"; Hence the ideals 


i=1,2,...,h, are the only prime zero-dimensional ideals in %*. 
* We recall that K* coincides with the residue class field of B*;- 




This congruence is in contradiction with the uniqueness of the polynomial $\psi_0 + \psi_1 + \cdots + \psi_m$ satisfying the congruence (14) (Section 9) (with $\mathfrak{P}^*$ replaced by $\mathfrak{P}^*_i$). The uniqueness of this polynomial, for a given element $\omega$, shows namely that no form of degree $s$ in $\eta_1, \dots, \eta_r$, with coefficients in $K^*$ and not identically zero, can belong to $\mathfrak{P}^{*\,s+1}_i$. The relation (23) is thus proved.

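Now let $\omega_1, \dots, \omega_r$ be elements of $\mathfrak{J}$ which are uniformizing parameters at one of the points $P^*_i$, say at $P^*_1$. We have then the relation: $\mathfrak{J}^*_1\cdot(\omega_1, \dots, \omega_r) = \mathfrak{P}^*_1$. Let $\alpha$ be any element of $\mathfrak{P}$ and let, by (12) (Section 9):

(24) $\alpha \equiv c^*_1\omega_1 + \cdots + c^*_r\omega_r \ (\mathfrak{P}^{*2}_1)$, $c^*_i \in K^*$.

Let, in particular,

(25) $\eta_i \equiv c^*_{i1}\omega_1 + \cdots + c^*_{ir}\omega_r \ (\mathfrak{P}^{*2}_1)$, $c^*_{ij} \in K^*$.

Since the $\eta_i$ are uniformizing parameters at $P$, we have: $\omega_i = A_{i1}\eta_1 + \cdots + A_{ir}\eta_r$, $A_{ij} \in \mathfrak{J}$. Let $A_{ij} \equiv d^*_{ij}\ (\mathfrak{P}^*_1)$. Since the $A_{ij}$ are in $\mathfrak{J}$, the elements $d^*_{ij}$ are not only in $K^*$ but also in $K_{\mathfrak{p}}$. We have:

(26) $\omega_i \equiv d^*_{i1}\eta_1 + \cdots + d^*_{ir}\eta_r \ (\mathfrak{P}^{*2}_1)$.

The matrix $(c^*_{ij})$ in (25) is non-singular, since $\eta_1, \dots, \eta_r$ are also uniformizing parameters at $P^*_1$. Comparing with (26) we see that the matrix $(c^*_{ij})$ is the inverse of the matrix $(d^*_{ij})$, and consequently, since $d^*_{ij} \in K_{\mathfrak{p}}$, we conclude that the $c^*_{ij}$ also belong to $K_{\mathfrak{p}}$. Now, let $\alpha \equiv d^*_1\eta_1 + \cdots + d^*_r\eta_r \ (\mathfrak{P}^{*2}_1)$. The same argument by which the $d^*_{ij}$ have been proved to belong to $K_{\mathfrak{p}}$ shows that $d^*_1, \dots, d^*_r$ belong to $K_{\mathfrak{p}}$. Since $c^*_j = \sum_i d^*_i c^*_{ij}$, it follows that also the coefficients $c^*_i$ in (24) belong to $K_{\mathfrak{p}}$. Consequently there exist in $\mathfrak{J}$ elements $A_1, \dots, A_r$ such that $A_i \equiv c^*_i\ (\mathfrak{P}^*_1)$, and for such elements the relation (24) implies the following:

$\alpha \equiv A_1\omega_1 + \cdots + A_r\omega_r \ (\mathfrak{P}^{*2}_1)$.

Since $\alpha - A_1\omega_1 - \cdots - A_r\omega_r \in \mathfrak{J}$, this last congruence, in view of (23), implies that:

$\alpha \equiv A_1\omega_1 + \cdots + A_r\omega_r \ (\mathfrak{P}^2)$.

Such a congruence holds for any element $\alpha$ in $\mathfrak{P}$. Consequently

$\mathfrak{P} = (\mathfrak{J}\cdot(\omega_1, \dots, \omega_r), \mathfrak{P}^2)$,

and therefore, by an argument used before (footnotes 19 and 20),

$\mathfrak{J}\cdot(\omega_1, \dots, \omega_r) = \mathfrak{P}$,

i.e. $\omega_1, \dots, \omega_r$ are uniformizing parameters at $P$, q.e.d.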




13. Let $\xi^*_i = \sum_{j=1}^{n} u_{ij}\xi_j$, $i = 1, 2, \dots, r$, $u_{ij} \in K^*$, and let $\xi^*_i \equiv v^*_i\ (\mathfrak{P}^*_1)$, $v^*_i \in K^*$. By Theorem 5, the elements $\xi^*_i - v^*_i$ are uniformizing parameters at $P^*_1$, provided the $u_{ij}$ are "non special," and moreover, $\mathfrak{o}^*$ is integrally dependent on $K^*[\xi^*_1, \dots, \xi^*_r]$. Since the values of the $u_{ij}$ to be avoided are those which satisfy certain algebraic relations, we may choose the $u_{ij}$ in $K$, and therefore we may assume that $\xi^*_1, \dots, \xi^*_r$ are elements of $\mathfrak{o}$. The constants $v^*_1, \dots, v^*_r$ are algebraic over $K$. Let $f_i(v^*_i) = 0$ be the irreducible equation over $K$ which is satisfied by $v^*_i$, and let us consider the elements $\omega_i = f_i(\xi^*_i)$, $i = 1, 2, \dots, r$. These are elements in $\mathfrak{o}$ and $\mathfrak{o}$ is integrally dependent on the ring $K[\omega_1, \dots, \omega_r]$, since $\xi^*_i$ is integrally dependent on $K[\omega_i]$. Moreover, $\omega_i \equiv 0\ (\mathfrak{P})$, since $f_i(\xi^*_i) \equiv f_i(v^*_i)\ (\mathfrak{P}^*_1)$ and $f_i(v^*_i) = 0$. Let $v^*_{i1} = v^*_i, v^*_{i2}, \dots, v^*_{i,g_i}$ be the conjugates of $v^*_i$ over $K$. We have: $\omega_i = f_i(\xi^*_i) = \prod_{j=1}^{g_i}(\xi^*_i - v^*_{ij})$. Since $\xi^*_i - v^*_{ij} \not\equiv 0\ (\mathfrak{P}^*_1)$ if $j \neq 1$, the product $\prod_{j=2}^{g_i}(\xi^*_i - v^*_{ij})$ is a unit in the quotient ring $\mathfrak{J}^*_1$. Hence

$\mathfrak{J}^*_1\cdot(\omega_1, \dots, \omega_r) = \mathfrak{J}^*_1\cdot(\xi^*_1 - v^*_1, \dots, \xi^*_r - v^*_r) = \mathfrak{P}^*_1$,

since $\xi^*_1 - v^*_1, \dots, \xi^*_r - v^*_r$ are uniformizing parameters at $P^*_1$. It follows that also $\omega_1, \dots, \omega_r$ are uniformizing parameters at $P^*_1$, and consequently they are uniformizing parameters also at $P$ (Theorem 7). We have thus proved the following

THEOREM 8. If $P$ is a simple point, uniformizing parameters $\omega_1, \dots, \omega_r$ at $P$ can be found in such a fashion as to satisfy the conditions: (a) $\omega_i \in \mathfrak{o}$; (b) $\mathfrak{o}$ is integrally dependent on $K[\omega_1, \dots, \omega_r]$.

This is an extension of Theorem 5, except for that part of Theorem 5 which asserts that the uniformizing parameters may be chosen as linear forms in the $\xi_i$. This part of the theorem is not valid, of course, in the general case.
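
The factorization used in Section 13 to pass from $\xi^*_i - v^*_i$ to $\omega_i = f_i(\xi^*_i)$ can be seen on a small example. The following sketch rests on illustrative assumptions that are not part of the theorem: $K$ is taken to be the field of rational numbers, a single coördinate function is written $y$, and $v^* = \sqrt{2}$, so that $f(t) = t^2 - 2$ (compare the element $\eta_1 = y^2 - 2$ in footnote 26).

```latex
% Assumed for illustration: K = \mathbb{Q}, \xi^* = y, v^* = \sqrt{2}, f(t) = t^2 - 2.
\begin{align*}
\omega = f(y) &= y^2 - 2 = (y - \sqrt{2})(y + \sqrt{2}), \\
y + \sqrt{2} &\equiv 2\sqrt{2} \not\equiv 0 \ (\mathfrak{P}^*_1),
\qquad\text{since } y \equiv \sqrt{2} \ (\mathfrak{P}^*_1).
\end{align*}
% Hence y + \sqrt{2} is a unit in \mathfrak{J}^*_1, and
% \mathfrak{J}^*_1 \cdot (\omega) = \mathfrak{J}^*_1 \cdot (y - \sqrt{2}).
```

Here $\omega$ has coefficients in $K$ and yet generates the same ideal at $P^*_1$ as $y - \sqrt{2}$. At the conjugate point, where $y \equiv -\sqrt{2}$, the two factors exchange roles.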

14. In this and in the following sections we wish to prove the following important theorem:

THEOREM 9. The quotient ring $\mathfrak{J}$ ($= \mathfrak{o}_{\mathfrak{p}}$) of a simple point $P$ contains the relative algebraic closure of $K$ in $\Sigma$.

We shall need several lemmas. Let $K'$ denote, as usual, the relative algebraic closure of $K$ in $\Sigma$.

LEMMA 4. $K'$ is contained in the residue class field $K_{\mathfrak{p}}$.

Proof. By the assertion $K' \subseteq K_{\mathfrak{p}}$ we mean the following. We know that the residue class field $K^*_{\mathfrak{p}^*_i}$ at each point $P^*_i$ ($i = 1, 2, \dots, h$) coincides with the ground field $K^*$. Since $P^*_i$ is a simple point, it follows (Theorem 6) that $K^*$ is algebraically closed in $\Sigma^*$, whence $K' \subseteq K^*$. In the homomorphic mapping of $\mathfrak{J}^*_i$ ($= \mathfrak{o}^*_{\mathfrak{p}^*_i}$) upon $K^*$ ($= \mathfrak{o}^*/\mathfrak{p}^*_i$), the elements of $\mathfrak{J}$ are mapped




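upon a set of elements which form a subfield $K_{\mathfrak{p}}^{(i)}$ of $K^*$, simply isomorphic to $K_{\mathfrak{p}}$. The assertion of the lemma is to the effect that $K' \subseteq K_{\mathfrak{p}}^{(i)}$, for $i = 1, 2, \dots, h$.

The proof is similar to the second part of the proof of Theorem 6. Let $\theta$ be an element of $K'$, $\theta = \alpha/\beta$, $\alpha, \beta \in \mathfrak{J}$, and let $\beta = \psi_\rho + \psi_{\rho+1} + \cdots$ be the expansion of $\beta$ at the point $P^*_i$, in terms of the uniformizing parameters $\eta_1, \dots, \eta_r$ at $P$. We have $\beta \equiv 0\ (\mathfrak{P}^{*\,\rho}_i)$, whence, by (23), $\beta \equiv 0\ (\mathfrak{P}^{\rho})$. We therefore can write $\beta$ in the form: $\beta = \sum A_{i_1 \dots i_r}\eta_1^{i_1} \cdots \eta_r^{i_r}$, $i_1 + \cdots + i_r = \rho$, $A_{(i)} \in \mathfrak{J}$. The coefficients $c_{(i)}$ of the form $\psi_\rho$ are obviously the $K^*$-residues of the elements $A_{(i)}$ mod $\mathfrak{P}^*_i$. Since the $A_{(i)}$ are elements of $\mathfrak{J}$, we conclude that the $c_{(i)}$ belong to $K_{\mathfrak{p}}^{(i)}$. For the element $\alpha$ we will have the expansion: $\alpha = \theta\psi_\rho + \cdots$, and by the same argument we deduce that the coefficients $\theta c_{(i)}$ belong to $K_{\mathfrak{p}}^{(i)}$. Since not all the coefficients $c_{(i)}$ are zero, it follows that $\theta$ is an element of $K_{\mathfrak{p}}^{(i)}$, as was asserted.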


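LEMMA 5. If $\beta \in \mathfrak{J}$ and if $\eta_1, \dots, \eta_r$ are uniformizing parameters at the simple point $P$, then the power series expansion

$\beta = \psi_0 + \psi_1 + \psi_2 + \cdots$

of $\beta$ at $P^*_i$ has all its coefficients in $K_{\mathfrak{p}}^{(i)}$.


Proof (footnote 25). That $\psi_0$ is an element of $K_{\mathfrak{p}}^{(i)}$ is trivial, since $\beta \equiv \psi_0\ (\mathfrak{P}^*_i)$ and $\beta \in \mathfrak{J}$. We therefore use induction. We assume namely, for every element $\beta$ in $\mathfrak{J}$, that the coefficients of $\psi_0, \psi_1, \dots, \psi_{m-1}$ are in $K_{\mathfrak{p}}^{(i)}$, and we prove that also the coefficients of $\psi_m$ are in $K_{\mathfrak{p}}^{(i)}$.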

Let ¢o = 2 + + Jr =o, be an arbitrary form 

) 


of degree o in m,° yr, whose coefficients cj) are in Ky. Let Aj,...3, 
be an element of & such that Aj,...;,=c;,...;,($*:). If we put 
a= >Aj,...j,m%*** rit, then the expansion of « at P*; is of the form: 
% = do + terms of higher degree. 

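Let $\alpha = \phi_\sigma + \phi_{\sigma+1} + \phi_{\sigma+2} + \cdots$. The form $\phi_{\sigma+\nu}$ depends in an obvious manner on the terms of degree $\nu$ of the expansion of the various elements $A_{(j)}$. By our induction, the coefficients of these terms are in $K_{\mathfrak{p}}^{(i)}$, if $\nu \leq m - 1$. Hence the coefficients of $\phi_{\sigma+1}, \dots, \phi_{\sigma+m-1}$ belong to $K_{\mathfrak{p}}^{(i)}$. By the same argument we can find successively elements $\alpha_1, \dots, \alpha_{m-2}$ in $\mathfrak{J}$ such that: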


G 


o+jrl 


where the coefficients of the forms ¢‘9),: - +,‘ are in K,“ and 
o+j+m-1 


+ Q) +t 60) = (7 


o+j o+j 


25 We point out that in the course of the proof of the preceding Lemma we have 
incidentally established the truth of Lemma 5 for the coefficients of the terms of 
lowest degree of the expansion of B. 




If we put y= +--+ Gm», then the expansion of y is of the form: 
y = bo + Jorm-1 + terms of higher degree, where go.m-1 is a form of degree 
o-+m—1 in m,**+,r with coefficients in K,. We now take succes- 
sively for the forms ++, of (27). We get then elements 
Vis > Ym-1 Such that y: = gm + terms of degree > m, yj = Wj + 
terms of degree > m, j =2,: + -m—1. Here the coefficients of gm are ele- 
ments of Ky‘. Let Yo =6, C K,™ and let 


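The element $\omega$ has the following expansion at $P^*_i$:


(28) $\omega = \theta_1 + (\psi_m - g_m) + \text{terms of degree} > m$.

Let $f(\theta_1) = 0$ be the irreducible equation, of degree $g$, which $\theta_1$ satisfies over $K$, and let $\theta_2, \dots, \theta_g$ be the conjugates of $\theta_1$ over $K$. We have $f(\omega) = (\omega - \theta_1)(\omega - \theta_2) \cdots (\omega - \theta_g)$, and from (28) it follows immediately that:

$f(\omega) \equiv (\psi_m - g_m)\,f'(\theta_1) \pmod{\mathfrak{P}^{*\,m+1}_i}$.

Now $f(\omega)$ is an element of $\mathfrak{J}$ and $(\psi_m - g_m)f'(\theta_1)$ is the set of terms of lowest degree in the expansion of $f(\omega)$ at $P^*_i$. Hence (footnote 25) the coefficients of the form $(\psi_m - g_m)f'(\theta_1)$ are in $K_{\mathfrak{p}}^{(i)}$. Since $\theta_1 \in K_{\mathfrak{p}}^{(i)}$, also $f'(\theta_1)$ belongs to $K_{\mathfrak{p}}^{(i)}$, and since the coefficients of $g_m$ are in $K_{\mathfrak{p}}^{(i)}$, it follows that the coefficients of $\psi_m$ belong to $K_{\mathfrak{p}}^{(i)}$, as was asserted.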
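
The congruence for $f(\omega)$ used at the end of the proof of Lemma 5 follows from the factorization of $f$ over its roots. A short derivation is sketched below. It assumes $f$ is monic with distinct roots $\theta_1, \dots, \theta_g$ (distinctness holds in characteristic zero), and the abbreviation $u = \psi_m - g_m$, a form of degree $m \geq 1$, is introduced here only for the illustration.

```latex
% u = \psi_m - g_m is an abbreviation used only in this sketch.
\begin{align*}
\omega - \theta_1 &= u + (\text{terms of degree} > m), \\
\omega - \theta_j &= (\theta_1 - \theta_j) + (\text{terms of degree} \geq m), \qquad j = 2, \dots, g, \\
f(\omega) = \prod_{j=1}^{g} (\omega - \theta_j)
  &= u \prod_{j=2}^{g} (\theta_1 - \theta_j) + (\text{terms of degree} > m)
   = u\, f'(\theta_1) + (\text{terms of degree} > m).
\end{align*}
% Here f'(\theta_1) = \prod_{j \ge 2} (\theta_1 - \theta_j) \neq 0 because the roots are distinct.
```

Since $f'(\theta_1) \neq 0$, the form $u\,f'(\theta_1)$ is indeed the term of lowest degree of $f(\omega)$ whenever $u \neq 0$, and one may divide by $f'(\theta_1)$ to recover the coefficients of $u$.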


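15. Our next lemma concerns arbitrary integral domains in which every ideal possesses a finite basis (Hilbert's basis theorem). Let $\mathfrak{O}$ be such an integral domain and let $\mathfrak{p}$ be a maximal prime ideal in $\mathfrak{O}$. We consider an arbitrary ideal $\mathfrak{A}$ in $\mathfrak{O}$ and its decomposition into maximal primary components. Let $\mathfrak{q}_1, \dots, \mathfrak{q}_m$ be the primary components of $\mathfrak{A}$ whose prime ideals $\mathfrak{p}_1, \dots, \mathfrak{p}_m$ are multiples of $\mathfrak{p}$. Let $\mathfrak{q}'_1, \mathfrak{q}'_2, \dots$ be the remaining primary components of $\mathfrak{A}$; $\mathfrak{p}'_1, \mathfrak{p}'_2, \dots$ their prime ideals. Thus we have:

$\mathfrak{p}_i \equiv 0\ (\mathfrak{p})$, ($i = 1, 2, \dots, m$);
$\mathfrak{p}'_j \not\equiv 0\ (\mathfrak{p})$, ($j = 1, 2, \dots$).

Let $\mathfrak{q}$ denote a primary ideal belonging to $\mathfrak{p}$ and let $\Delta(\mathfrak{A}, \mathfrak{q})$ be the intersection of all the ideals $(\mathfrak{A}, \mathfrak{q})$ as $\mathfrak{q}$ runs through the totality of all primary ideals belonging to $\mathfrak{p}$.

LEMMA 6. $\Delta(\mathfrak{A}, \mathfrak{q}) = [\mathfrak{q}_1, \dots, \mathfrak{q}_m]$.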


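Proof. If we assume that the lemma is true for primary ideals $\mathfrak{A}$, then the lemma follows in general. Namely, we have: $\Delta(\mathfrak{A}, \mathfrak{q}) \subseteq \Delta(\mathfrak{q}_i, \mathfrak{q})$, and hence, by our assumption:

(29) $\Delta(\mathfrak{A}, \mathfrak{q}) \equiv 0\ ([\mathfrak{q}_1, \mathfrak{q}_2, \dots, \mathfrak{q}_m])$.

We put $\mathfrak{A}_1 = [\mathfrak{q}_1, \dots, \mathfrak{q}_m]$, $\mathfrak{A}_2 = [\mathfrak{q}'_1, \mathfrak{q}'_2, \dots]$. Since $\mathfrak{p}$ is maximal and $\mathfrak{A}_2 \not\equiv 0\ (\mathfrak{p})$, we have $(\mathfrak{A}_2, \mathfrak{q}) = (1)$ for any primary ideal $\mathfrak{q}$ belonging to $\mathfrak{p}$. Consequently, we can write: $\mathfrak{A}_1 = (\mathfrak{A}_1\mathfrak{A}_2, \mathfrak{A}_1\mathfrak{q}) \equiv 0\ (\mathfrak{A}, \mathfrak{q})$, whence:

(29') $\mathfrak{A}_1 \equiv 0\ (\Delta(\mathfrak{A}, \mathfrak{q}))$.

From (29) and (29') our lemma follows.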

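Let now $\mathfrak{A}$ be a primary ideal, and let $\mathfrak{P}$ be the associated prime ideal. We may assume $\mathfrak{P} \equiv 0\ (\mathfrak{p})$, because in the contrary case the lemma is trivial. We denote by $\mathfrak{d}$ our ideal $\Delta(\mathfrak{A}, \mathfrak{q})$ and we first establish the following relation:

(30) $(\mathfrak{d}\mathfrak{p}, \mathfrak{A}) = \mathfrak{d}$.

Since $\mathfrak{A} \equiv 0\ (\mathfrak{d})$, the relation

(31) $(\mathfrak{d}\mathfrak{p}, \mathfrak{A}) \equiv 0\ (\mathfrak{d})$

is trivial. Let

(32) $(\mathfrak{d}\mathfrak{p}, \mathfrak{A}) = [\mathfrak{q}, \mathfrak{q}'_1, \mathfrak{q}'_2, \dots]$

be the decomposition of the ideal $(\mathfrak{d}\mathfrak{p}, \mathfrak{A})$ into maximal primary components, where we assume that the prime ideal $\mathfrak{p}'_i$ associated with $\mathfrak{q}'_i$ is $\neq \mathfrak{p}$, $i = 1, 2, \dots$, and that $\mathfrak{q}$ is either the unit ideal or belongs to $\mathfrak{p}$. Since $\mathfrak{A} \equiv 0\ (\mathfrak{q})$, we have, by definition of $\mathfrak{d}$, that $\mathfrak{d} \equiv 0\ (\mathfrak{q})$. On the other hand $\mathfrak{d}\mathfrak{p} \equiv 0\ (\mathfrak{q}'_i)$, and since $\mathfrak{p} \not\equiv 0\ (\mathfrak{p}'_i)$ it follows that $\mathfrak{d} \equiv 0\ (\mathfrak{q}'_i)$. Hence $\mathfrak{d} \equiv 0\ (\mathfrak{d}\mathfrak{p}, \mathfrak{A})$, and (30) follows in view of (31).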

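By means of the relation (30) the proof of the lemma is readily completed. Let $d_1, \dots, d_\rho$ be a basis for the ideal $\mathfrak{d}$. By (30) we have the following set of relations: $d_i = \sum_{j=1}^{\rho} p_{ij}d_j + a_i$, $i = 1, 2, \dots, \rho$, where the $p_{ij}$ are in $\mathfrak{p}$ and $a_1, \dots, a_\rho$ are elements of $\mathfrak{A}$. Hence

(33) $\sum_{j=1}^{\rho} (\delta_{ij} - p_{ij})\,d_j \equiv 0\ (\mathfrak{A})$, ($i = 1, 2, \dots, \rho$),

where $\delta_{ij} = 0$ or 1, according as $i \neq j$ or $i = j$. The determinant $\Delta = |\delta_{ij} - p_{ij}|$ is of the form $1 + p$, $p \equiv 0\ (\mathfrak{p})$. Hence $\Delta \not\equiv 0\ (\mathfrak{p})$, whence a fortiori $\Delta \not\equiv 0\ (\mathfrak{P})$. Since $\mathfrak{A}$ is primary, with $\mathfrak{P}$ as associated prime ideal, we conclude from (33) that $d_1, \dots, d_\rho$ belong to $\mathfrak{A}$. Hence $\mathfrak{d} \equiv 0\ (\mathfrak{A})$, and since $\mathfrak{A} \equiv 0\ (\mathfrak{d})$, it follows that $\Delta(\mathfrak{A}, \mathfrak{q}) = \mathfrak{d} = \mathfrak{A}$, q.e.d.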
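
The determinant argument that closes the proof of Lemma 6 can be checked by hand when the basis of the intersection ideal has two elements. The sketch below assumes a basis $d_1, d_2$ and writes $D$ for the determinant $|\delta_{ij} - p_{ij}|$. The name $D$ and the restriction to two basis elements are choices made for this illustration only.

```latex
% Two basis elements d_1, d_2; p_{ij} \in \mathfrak{p}; D is a name chosen for this example.
\begin{align*}
D &= (1 - p_{11})(1 - p_{22}) - p_{12}p_{21}
   = 1 - (p_{11} + p_{22}) + (p_{11}p_{22} - p_{12}p_{21}) \equiv 1 \pmod{\mathfrak{p}}, \\
D\,d_1 &= (1 - p_{22})\bigl[(1 - p_{11})d_1 - p_{12}d_2\bigr]
        + p_{12}\bigl[-p_{21}d_1 + (1 - p_{22})d_2\bigr] \equiv 0 \pmod{\mathfrak{A}}, \\
D\,d_2 &= p_{21}\bigl[(1 - p_{11})d_1 - p_{12}d_2\bigr]
        + (1 - p_{11})\bigl[-p_{21}d_1 + (1 - p_{22})d_2\bigr] \equiv 0 \pmod{\mathfrak{A}}.
\end{align*}
% Each bracket is one of the two expressions in (33), hence lies in \mathfrak{A}.
```

Since $D \equiv 1\ (\mathfrak{p})$, the element $D$ lies outside $\mathfrak{p}$ and therefore outside the associated prime ideal of $\mathfrak{A}$, which is a multiple of $\mathfrak{p}$. Because $\mathfrak{A}$ is primary, $D\,d_i \equiv 0\ (\mathfrak{A})$ then gives $d_i \equiv 0\ (\mathfrak{A})$.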


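16. Now at last we are in position to prove Theorem 9. Let $\theta$ be an element of $K'$. Since $\Sigma$ is the quotient field of $\mathfrak{J}$, we can write $\theta = \alpha/\beta$, $\alpha, \beta \in \mathfrak{J}$. We may assume $\beta \equiv 0\ (\mathfrak{P})$, because otherwise there is nothing to prove. Consider one of the points $P^*_1, \dots, P^*_h$, say $P^*_1$. By Lemma 4 there exists an element $\gamma_1$ in $\mathfrak{J}$ such that $\gamma_1 \equiv \theta\ (\mathfrak{P}^*_1)$. Let

$\gamma_1 = \theta + \psi_1 + \psi_2 + \cdots + \psi_n + \cdots$

be the expansion of $\gamma_1$ at $P^*_1$, in terms of the uniformizing parameters $\eta_1, \dots, \eta_r$ at $P$. Here $\psi_i$ is a form of degree $i$, and its coefficients are in $K_{\mathfrak{p}}^{(1)}$ (Lemma 5). In particular, the coefficients of the linear form $\psi_1$ are in $K_{\mathfrak{p}}^{(1)}$, and therefore we can find an element $\gamma'_1$ in $\mathfrak{J}$ whose expansion is of the form: $\gamma'_1 = -\psi_1 + \cdots$. If we put $\gamma_2 = \gamma_1 + \gamma'_1$, then $\gamma_2 = \theta + \psi_2^{(2)} + \text{terms of degree} > 2$, where $\psi_2^{(2)}$ is a quadratic form in $\eta_1, \dots, \eta_r$ with coefficients in $K_{\mathfrak{p}}^{(1)}$. Let $\gamma'_2$ be an element of $\mathfrak{J}$ whose expansion is of the form: $\gamma'_2 = -\psi_2^{(2)} + \text{terms of degree} > 2$, and let $\gamma_3 = \gamma_2 + \gamma'_2$. Then $\gamma_3$ is an element of $\mathfrak{J}$ whose expansion at $P^*_1$ is of the form: $\gamma_3 = \theta + \psi_3^{(3)} + \text{terms of degree} > 3$, and again all the coefficients in this expansion belong to $K_{\mathfrak{p}}^{(1)}$ (Lemma 5). Continuing in this manner, we can find for each positive integer $i$ an element $\gamma_i$ in $\mathfrak{J}$ whose expansion is of the form: $\gamma_i = \theta + \psi_i^{(i)} + \text{terms of degree} > i$. Here $\psi_i^{(i)}$ is a form of degree $i$ in $\eta_1, \dots, \eta_r$, with coefficients in $K_{\mathfrak{p}}^{(1)}$. Hence $\gamma_i \equiv \theta\ (\mathfrak{P}^{*\,i}_1)$, and since $\alpha = \theta\beta$ it follows that $\alpha - \gamma_i\beta \equiv 0\ (\mathfrak{P}^{*\,i}_1)$. Now $\alpha - \gamma_i\beta$ is an element of $\mathfrak{J}$ and we have proved earlier that $\mathfrak{P}^{*\,i}_1 \cap \mathfrak{J} = \mathfrak{P}^i$ (congruence (23), Section 12). Hence

(34) $\alpha - \gamma_i\beta \equiv 0\ (\mathfrak{P}^i)$.

Let $\mathfrak{Q}$ be an arbitrary primary ideal belonging to $\mathfrak{P}$ and let $\rho$ be the exponent of $\mathfrak{Q}$. From (34), for $i = \rho$, we deduce $\alpha \equiv 0\ (\mathfrak{J}\cdot\beta, \mathfrak{Q})$, whence


(35) $\alpha \equiv 0\ (\Delta(\mathfrak{J}\cdot\beta, \mathfrak{Q}))$.


Since $\mathfrak{P}$ is maximal, we may apply Lemma 6, where we put $\mathfrak{A} = \mathfrak{J}\cdot\beta$, $\mathfrak{p} = \mathfrak{P}$. In applying this lemma we must take into account that every ideal in $\mathfrak{J}$ is a multiple of $\mathfrak{P}$, whence the primary components $\mathfrak{q}'_1, \mathfrak{q}'_2, \dots$ of $\mathfrak{A}$ are never present in the case under consideration. In other words: in the present case our lemma asserts that the intersection of the ideals $(\mathfrak{J}\cdot\beta, \mathfrak{Q})$ is the ideal $\mathfrak{J}\cdot\beta$ itself. Hence, in view of (35), we conclude that $\alpha \in \mathfrak{J}\cdot\beta$, whence $\alpha/\beta$, i.e. $\theta$, is an element of $\mathfrak{J}$. This completes the proof of Theorem 9.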

The following theorem is a generalization of Theorem 6: 

THEOREM 10. In order that $P$ be a simple point of $V_r$ it is necessary and sufficient that: (1) $P^*_1, \dots, P^*_h$ be simple points of $V^*_r$ and (2) that the quotient ring $\mathfrak{J}$ ($= \mathfrak{o}_{\mathfrak{p}}$) of $P$ contain the relative algebraic closure $K'$ of $K$ in $\Sigma$.

Proof. We have already proved that the conditions are necessary (see Section 12 and Theorem 9). We prove that they are sufficient. The uniformizing parameters at the simple point $P^*_1$ may be chosen in $\mathfrak{J}$ (Section 13). Let $\eta_1, \dots, \eta_r$ be such uniformizing parameters. We have


(36) $\mathfrak{J}^*_1\cdot(\eta_1, \dots, \eta_r) = \mathfrak{P}^*_1$,


214 OSCAR ZARISKI. 


where 3$*, = 0*+,. We now use condition (2). Since K’ C Ma we can 
apply Theorem 2’ to the ring (put 3, A=A’=K’). Let = 
and let B2,---, Bo,oSh (see (21)), be the conjugates of the prime ideal R, 
under the relative automorphisms of =* over 3. The intersection Bo] 
is an invariant ideal and its contracted ideal in Y is %. Hence, by Theorem 

+, Bo] and consequently o—h, i.e. the prime ideals 
Ba in which lie over B form a complete set of conjugate ideals, 
From (22) and (36) it follows that 3, is a maximal isolated component of 
the ideal 3*(m,°-°-*,yr). Since this last ideal is invariant, it follows that 
all the ideals §;, 41=1,2,---+,h, are maximal isolated components of 
S*(m.° Taking into account the fact that -, are the only 
maximal prime ideals in it follows that 3*(m,°°*, yr) = Bi 
Hence, by Theorem 1, 


mr) = ®, 
and this shows that P is a simple point, q.e.d.


17. In Section 10 we have extended to the case K, = K the theorem on 
the different G’,, proved in our paper.’® We now propose to prove this theorem 
in the most general case now under consideration.

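Let $P$ be a point of $V_r$, $\mathfrak{p}$ the corresponding prime $\mathfrak{o}$-ideal. Let $\eta_1, \dots, \eta_r$
be algebraically independent elements of $\mathfrak{p}$ such that $\mathfrak{o}$ is integrally dependent on
the ring $K[\eta_1, \dots, \eta_r]$. Given an element $\omega$ in $\mathfrak{o}$, we denote by $G(\eta_1, \dots, \eta_r; z)$
the norm of $z - \omega$ with respect to the field $K(\eta_1, \dots, \eta_r)$.


THEOREM 11. A necessary and sufficient condition that $P$ be a simple
point and that $\eta_1, \dots, \eta_r$ be uniformizing parameters at $P$, is that there exist
an element $\omega$ in $\mathfrak{o}$ such that $G'_\omega(\eta_1, \dots, \eta_r; \omega) \not\equiv 0(\mathfrak{p})$.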

We first prove that the condition is necessary. If K’ is the relative 
algebraic closure of K in %, then since P is a simple point, K’ f Ky (Theorem 
9, or Lemma 4). Let 

K’] =m, [K’: K] =p, [K*: K’] =g, 
whence [Ky:K] = mp. Since K’C&, the ideal 3*§8 decomposes into at 
most m prime ideals Qi, i.e. we have h S m (see Section 4, where we should 
put =A, A=K™*, whence Apy= Ky, since Ky © K*). On the other 


2° Hence - are also uniformizing parameters at P*,,- - - Without 
the condition K’ C & this is not always true. For instance, in the example given in 
footnote 14 let P be the point given by the ideal g - (2, y? — 2,2). The corresponding points 
p* , P*, on V*,, are given by the prime ideals 9* (2, y — v2), o*(2,y + v2) respectively. 
The elements 7, = y? — 2, 9, =2 + 2a (= V2.a(y + V2) ) are uniformizing parameters 
at P*, but not at P*,. 




hand, since K’ is algebraically closed in &, the prime ideals of 3*§$ form a set 
of conjugates under the relative automorphisms of &* over %. These auto- 
morphisms are extensions of the relative automorphisms of K* over K’. If we 
now take into account the relation (6), or (6’), of Section 4, we deduce that 
h =m and that 


Let F(y,° denote the norm of z—wo with respect to the field 
K*(4:,° °°, r), Where » is an element of o*. Our theorem is true for each 
of the simple points P*,,- - +, P*m, in view of the fact that at each of these 


points the residue class field coincides with the ground field K*. Thus, dealing 
with the point P*,, we may assert that there exists an element » in o* such that 


With the aid of the Remark at the end of Section 10 we proceed to make a 
judicious choice of the element o. Let 66," be a primitive element of 
K, with respect to K and let == 1,2,---,m, 7=1,2,- be 
the conjugates of 6 over K. We choose the notations in such a fashion that 
6.5) +, Am is a complete set of conjugates with respect to the inter- 
mediate field K’. Let f(@) =0 be the irreducible equation, of degree mp, 
which @ satisfies over K. Since 6C K,’, there exists an element ¢ in o such 
that 


(39) 0,“ (p*,), 
whence 
= 0(p). 
Let 
be the decomposition of the ideal into primary maximal com- 
ponents (see (9’), Section 8), and let p:,° - -,o be prime ideals associated 
with the primary ideals -, qo respectively. The ideals Po are 


zero-dimensional, since 0 is integrally dependent on the ring K[m,- - -, 9r]. 
We show that there exists an element w in o such that 


(40) f(o) =0(p), 
(40’) f(o) F0(pi), *,@). 


<s Ky") is a subfield of K*, simply isomorphic to Ky: and is contained in the 
residue class field of the point a This field has been first introduced in the proof 
of Lemma 4 (Section 14). The fields Ky(™ are conjugate fields 
over K’, not necessarily distinct. 




We put, namely, »o = ¢ + ex, where cC K and = is an element of ) but 
not in pi, += Since o=€(p), the congruence f(w) = 0(p) 
follows from (39’). On the other hand we have: 


(we assume that the leading coefficient of $f(\theta)$ is 1). This is a polynomial
in with coefficients not all =0(p;), i=1,2,---,0, since 7” 0(p;). 
Hence we can find the constant ¢ in K so as to satisfy the relation f(o) 40(y,), 
for all i= 1, 2,---,o. We assert that the element of 0, constructed in this 
fashion, necessarily satisfies (38). Namely, let p*i1, be the prime 
ideals in o* which lie over p;. The ideals p*,,- - -, p*m,p*i; are then the 


prime ideals associated with the primary components of the ideal 0* (7, 


Since p*.,- - +, p*m are the conjugates of p*,, it follows, by (39), that 
(39”) 6, (p*;), 

whence 

(41) wo FO," (p*;), (4m -,m). 
Moreover, by (40’), we have f(o) #O(p*ij), = 1, 2,° whence, in 
particular, 

(41’) o FO, (p*ij), 1,2,- "995 j = 1, 


The relations (39”), (41) and (41’) imply (38), in view of the remark at the 
end of Section 10. 
Since $K'$ is algebraically closed in $\Sigma$, it follows, by an argument repeatedly
used before and based on Lemma 1, that the norm of $z - \omega$ with respect to
the field $K^*(\eta_1, \dots, \eta_r)$ is the same as the norm of $z - \omega$ with respect to the
field $K'(\eta_1, \dots, \eta_r)$. Therefore the coefficients of $F(\eta_1, \dots, \eta_r; z)$ are in
$K'$. If $F_1, \dots, F_{\mu-1}$ denote the conjugate polynomials of $F$ with respect to $K$
and if $G(\eta_1, \dots, \eta_r; z)$ is the norm of $z - \omega$ with respect to $K(\eta_1, \dots, \eta_r)$,


then obviously 


If we put F(0,--,032) then =0(p*,), since F(m, 
= () and 7; =0(p*,). Consequently ¢(z) is divisible by z— 6, (397). But
(38) implies that is not divisible by (2 Since 
is invariant under the relative automorphism of 3* over & (its coefficient 
being in K’), it follows likewise that ¢(z) is divisible by z— @;“’, but is not 
divisible by (z — )?, 1=1,2,° +, m. 

We can also show that $(z) is not divisible by z—6;"%, for all 7A! 
and i=1,2,:--,m. In fact, if say $(z) was divisible by z— 6,°’, then 
6, would have to be the value of » at some prime ideal p* belonging to the 




ideal o* (p*). This ideal p* could not be any 
of the ideals p*,,---,p*m, since o=6;" (p*;), i=1,2,---+,m, and 
A6;. Hence p* would have to be one of the ideals p*j;, p*j2,° 
i=1,2,: --,o, considered above. This, however, would be in contradiction 
with (407). 

The polynomial $(z) therefore factors as follows: 


m 
o(z) =I1(z—@™) -y(z), 

where y(z) and f(z) have no common roots. Let ¢;(z) (= Fj(0,---,0,z)), 
j=1,2,°°:,~—1, be the conjugates of #(z) with respect to K. We will 
have likewise: 

mm 
(2) = IT - y(2), > 
where again ~j(z) and f(z) have no common roots. By (42) we have: 
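$$G(0, \dots, 0; z) = f(z) \cdot A(z),$$
where $A(z)$ (the product of the $\psi_j(z)$) and $f(z)$ are relatively prime. Since $\eta_i \equiv 0(\mathfrak{p})$,
we have: $G'_\omega \equiv f'(\omega) \cdot A(\omega) + f(\omega) \cdot A'(\omega)\ (\mathfrak{p})$, i.e. $G'_\omega \equiv f'(\omega) \cdot A(\omega)\ (\mathfrak{p})$,
since $f(\omega) \equiv 0(\mathfrak{p})$ (40). Now we observe that $f'(\omega) \not\equiv 0(\mathfrak{p})$, since $f(\omega) \equiv 0(\mathfrak{p})$
and $f$ is an irreducible polynomial, and we also note that $A(\omega) \not\equiv 0(\mathfrak{p})$, since
$A(z)$ is not divisible by $f(z)$. Hence $G'_\omega \not\equiv 0(\mathfrak{p})$, as was asserted.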
18. Continuation of the proof. The condition is sufficient. From (42)
we deduce in the first place that $G'_\omega \not\equiv 0(\mathfrak{p})$ implies the relations $F'_\omega \not\equiv 0(\mathfrak{p}^*_i)$,
$i = 1, 2, \dots, h$. Hence from the hypothesis $G'_\omega \not\equiv 0(\mathfrak{p})$ it follows at any rate
that the points $P^*_1, \dots, P^*_h$ are simple (by the special case $K_\mathfrak{p} = K$; see
Section 10). It remains to prove that $K' \subset \mathfrak{J}$ (Theorem 10). We shall prove
the following stronger result: $\mathfrak{J}$ is integrally closed in $\Sigma$. Since $K \subset \mathfrak{J}$ and
$K'$ is an algebraic extension of $K$, the property of $\mathfrak{J}$ being integrally closed
will obviously imply that $K'$ is a subfield of $\mathfrak{J}$. Now to show that $\mathfrak{J}$ is
integrally closed in $\Sigma$, we consider the complementary module $\mathfrak{e}$ ²³ of the ring
$K[\eta_1, \dots, \eta_r, \omega]$. If $\nu$ denotes the relative degree $[\Sigma : K(\eta_1, \dots, \eta_r)]$, then
it is well known that the elements


form a module basis for with respect to the ring K[m,° Since 
0(p), it follows therefore that e is contained in X. Since e contains 


²² This assertion has been proved in the case of an algebraically closed ground
field (**, p. 263, footnote 12) and therefore is obviously true for any ground field.
²³ The module $\mathfrak{e}$ consists of those elements $\zeta$ of $\Sigma$ for which the trace $T(\zeta \cdot \alpha)$ is in
$K[\eta_1, \dots, \eta_r]$, for every element $\alpha$ in $K[\eta_1, \dots, \eta_r, \omega]$.




the integral closure of the ring K[m,- - -, mr] and since o is integrally depen- 
dent on this ring, we conclude that % contains the integral closure 0 of o. 
Let » be the contracted ideal of 8 in 0: P=. We have then 


Sp — 0p. On the other hand 05 0» (since =p), whence 0; 


Consequently xs =0;, and therefore X is integrally closed (since for the 
integrally closed ring 0 it is true that the quotient ring of any prime 0-ideal 
is integrally closed). This completes the proof of Theorem 11. 

As a corollary we have the following 

THEOREM 12. The quotient ring $\mathfrak{J}$ of a simple point is integrally closed
in its quotient field.

Remark. If $P$ is a simple point, with $\eta_1, \dots, \eta_r$ as uniformizing
parameters, and if $\mathfrak{o}$ is integrally dependent on $K[\eta_1, \dots, \eta_r]$, then the elements
$\omega$ of $\mathfrak{o}$ such that $G'_\omega \not\equiv 0(\mathfrak{p})$ are characterized by the relations (40), (40'),
where the irreducible polynomial $f(\omega)$ must be of degree $m\mu = [K_\mathfrak{p} : K]$. This
follows from the first part of the proof of Theorem 11 (Section 17) and from
the remark at the end of Section 10.

In Theorem 11 we have assumed that the elements $\eta_1, \dots, \eta_r$ are in $\mathfrak{p}$.
We now drop this assumption, and we consider the uniquely determined
irreducible polynomials $f_i(z)$ in $K[z]$ ($i = 1, 2, \dots, r$) such that $f_i(\eta_i) \equiv 0(\mathfrak{p})$.
We can easily prove the following stronger theorem:

THEOREM 11'. A necessary and sufficient condition that $P$ be a simple
point and that $f_1(\eta_1), \dots, f_r(\eta_r)$ be uniformizing parameters at $P$ is that
there exist an element $\omega$ in $\mathfrak{o}$ such that $G'_\omega(\eta_1, \dots, \eta_r; \omega) \not\equiv 0(\mathfrak{p})$.

That the condition is necessary follows almost immediately from Theorem
11. Namely, let us put $\zeta_i = f_i(\eta_i)$ and let $H(\zeta_1, \dots, \zeta_r; z)$ denote the
norm of $z - \omega$ with respect to the field $K(\zeta_1, \dots, \zeta_r)$, while $G(\eta_1, \dots, \eta_r; z)$
denotes, as before, the norm of $z - \omega$ with respect to the field $K(\eta_1, \dots, \eta_r)$.
Since this last field contains the field $K(\zeta_1, \dots, \zeta_r)$, it is clear that we have


an identity (in z) of the form: 


where $A$ is a polynomial with coefficients in $K$. Since $G(\eta_1, \dots, \eta_r; \omega) = 0$
we have

Now, by hypothesis, $\zeta_1, \dots, \zeta_r$ are uniformizing parameters at $P$, and
moreover, $\mathfrak{o}$ depends integrally on the ring $K[\zeta_1, \dots, \zeta_r]$, since each element $\eta_i$
is integrally dependent on $K[\zeta_i]$. Hence, by Theorem 11, we must have
$H'_\omega \not\equiv 0(\mathfrak{p})$, for a suitable element $\omega$ in $\mathfrak{o}$. The above identity shows then
that we must also have $G'_\omega \not\equiv 0(\mathfrak{p})$, as was asserted.




The proof that the condition is sufficient is direct and follows the lines
of the proof of sufficiency given in the special case of Theorem 11. The
relation $G'_\omega \not\equiv 0(\mathfrak{p})$ implies the relations $F'_\omega(\eta_1, \dots, \eta_r; \omega) \not\equiv 0(\mathfrak{p}^*_i)$, for
$i = 1, 2, \dots, h$. Hence $P^*_1, \dots, P^*_h$ are simple points and $\eta_1 - \theta_1^{(i)}, \dots,
\eta_r - \theta_r^{(i)}$ are uniformizing parameters at $P^*_i$, where the $\theta_j^{(i)}$ are elements
of $K^*$ and $\eta_j \equiv \theta_j^{(i)}\ (\mathfrak{p}^*_i)$ (Section 10). Since $f_j(\eta_j) \equiv 0(\mathfrak{p}) \equiv 0(\mathfrak{p}^*_i)$,
$\theta_j^{(i)}$ must be a root of $f_j(z)$. Since $f_j(z)$ has no repeated roots, it follows
that $f_j(\eta_j)$ differs from $\eta_j - \theta_j^{(i)}$ by a factor which is a unit in the quotient
ring of the point $P^*_i$. Hence $f_1(\eta_1), \dots, f_r(\eta_r)$, i.e. $\zeta_1, \dots, \zeta_r$, are
also uniformizing parameters at $P^*_i$ ($i = 1, 2, \dots, h$). It remains to prove
that $K' \subset \mathfrak{J}$, since from this it will follow that $P$ is a simple point (Theorem
10) and that $\zeta_1, \dots, \zeta_r$ are uniformizing parameters at $P$ (Theorem 7). The
rest of the proof is the same as before, namely it is shown that $\mathfrak{J}$ is integrally
closed in $\Sigma$.

COROLLARY. If $V_r$ is a linear space, i.e. if $\mathfrak{o}$ is a polynomial ring, then
every point of $V_r$ is simple.

In fact, if $\mathfrak{o} = K[\xi_1, \dots, \xi_r]$, where $\xi_1, \dots, \xi_r$ are algebraically
independent elements, then the norm $G$ of $z - \omega$ with respect to the field $K(\xi_1, \dots, \xi_r)$
is $z - \omega$ itself, whence $G'_\omega = 1$.


V. Simple subvarieties. 


19. Let V, be an irreducible simple subvariety of V+, of dimension s. 
Let » be the corresponding s-dimensional prime ideal in o and let § = o0,. By 


definition, there exist elements in say m,° * 9r-s, Such that 
Let £:,: - +, be elements of § which are algebraically independent mod p 


(with respect to the ground field K), but otherwise arbitrary. We take as 
new ground field the field Q = - -,£,) and we put 6 =Q-o0, = op. 
Since Q is a pure transcendental extension of K, ) is prime. It is of dimension 
zero, since the é’s are algebraically independent mod p. With Q as new ground 
field, the elements - define an (r—s)-dimensional variety 
which they are the codrdinates of the general point. The ideal p defines a 
point P on V;... The quotient ring § = 05 coincides with %, since the ele- 
ments of Q are units in %. Hence, by (43), 
i.e. $P$ is a simple point of $V_{r-s}$.

Conversely, assume that P is a simple point of Vr». There will exist 
then elements in i.e. in satisfying (43’). The relation (43) 




is equivalent to (43'). Hence $V_s$ is a simple subvariety of $V_r$. Thus the
assertions: $V_s$ is a simple subvariety of $V_r$; $P$ is a simple point of $V_{r-s}$, are
equivalent.

THEOREM 13. The quotient ring $\mathfrak{J}$ $(= \mathfrak{o}_\mathfrak{p})$ of a simple subvariety is
integrally closed in $\Sigma$.

The theorem is an immediate consequence of Theorem 12 and of the fact 
that J = J — 035. 

THEOREM 14. Jf Vs is a simple subvariety of V,, the uniformizing 
parameters m,° dlong Vs and the elements €,,- +,& algebraically 
independent mod , can be so chosen that they be elements of 0 and that 0 be 
integrally dependent on the ring K[m,° 9r-83 

Proof. Let 0 =0/p=—K[é.,- - -,én], whence 0 is of degree of tran- 
scendency s over K. Subject to a preliminary linear homogeneous transforma- 
tion on €,,° - *,&n, With coefficients u;; in K, we may assume that the following 


conditions are satisfied. 


(a) are integrally dependent on K[é,,- - &]; 

(b) * are integrally dependent on K[é,- -, 

(c) fils: are uniformizing para- 
meters along V,, where fi is an irreducible polynomial in és, with 


coefficients in K. 


The possibility of satisfying conditions (a) and (b) is trivial. .\s to 


condition (c). we observe that by (b) the elements &,- + +, é are algebraically 
independent mod p. We put i= It then follows from 


the proof of Theorem 8 (Section 12) that for “non special” wij, certain poly- 
nomials fi(és.i), 1, 8, with coefficients in Q(—= K(G,, &)) 
will be uniformizing parameters at the point P of a hence also along Vs. 
The values of the ui; to be avoided are those which satisfy certain algebraic 
relations with coefficients in Q. Since K contains infinitely many elements, 
the wi; may be chosen in K. We may assume that the leading coefficient 
of fi; is 1. If we put —fi(é:,° 1,2,° r—s, then 
(=&,°°*,&s) are algebraically independent mod and m,° 


are uniformizing parameters along Vs. Since 7; =0(p), we find, passing 


the ring 0: fi(é,° Essi) = 0. Since f; is irreducible, it follows by 
condition (b) that the coefficients of f; are polynomials in és. Hence 
is integrally dependent on K[é,,-° m1], 1,2,° whence 




This completes the proof of the theorem, in view of condition (a). 


20. Let m,° be algebraically independent elements of such that 
is integrally dependent on the ring K[m,- The residual class ring 
0=0/p will depend integrally on the ring K[#:,- - -,#r], and since 5 is of 


degree of transcendency s over K, s elements 4; have to be algebraically inde- 
pendent. Consequently s of the elements 7; are algebraically independent 
mod p. We assume that m,° are algebraically independent mod p. Let 
films’ **s 983 be the irreducible congruence which satisfies over 
K[m,°** ss]. Let moreover, be the norm of z—wo(wCo), 
over the field K(m:,- - -,-). As a generalization of Theorem 11’, we prove 
the following 


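THEOREM 15. The existence of an element $\omega$ in $\mathfrak{o}$ such that $F'_\omega(\eta_1, \dots,
\eta_r; \omega) \not\equiv 0(\mathfrak{p})$ is a necessary and sufficient condition in order that $V_s$ be a
simple subvariety of $V_r$ and that $f_{s+1}(\eta_1, \dots, \eta_s; \eta_{s+1}), \dots, f_r(\eta_1, \dots, \eta_s; \eta_r)$
be uniformizing parameters along $V_s$.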


The theorem is an immediate consequence of Theorem 11’. It is sufficient 
to observe that +, 9r3%) is also the norm of z—w with respect to the 
field *,9r), Where OQ =K(m,° +,7s). Moreover, as was pointed 
out in the preceding section, the subvariety Vs of V; and the point P of Vr-s 
(the elements now play the of -,&s) are both simple or 
not simple at the same time, and that uniformizing parameters along Vs are 
also uniformizing parameters at P, and conversely (see (43) and (43’)). 

An immediate corollary of Theorem 15 is the following: if $V_s$ contains
a simple point $P$ of $V_r$, then $V_s$ itself is a simple subvariety. In fact, if $\mathfrak{p}_0$
is the prime zero-dimensional $\mathfrak{o}$-ideal which corresponds to the point $P$, then
$\mathfrak{p} \equiv 0(\mathfrak{p}_0)$, and $G'_\omega \not\equiv 0(\mathfrak{p}_0)$ (Theorem 11') implies the relation $G'_\omega \not\equiv 0(\mathfrak{p})$.

We can invert this result. We show namely that a simple subvariety $V_s$
contains at least one simple point of $V_r$. Using the notations of Theorem 15,
if $V_s$ is simple, then $G'_\omega \not\equiv 0(\mathfrak{p})$, for some $\omega$ in $\mathfrak{o}$. We can therefore find a
prime zero-dimensional divisor $\mathfrak{p}_0$ of $\mathfrak{p}$ such that $G'_\omega \not\equiv 0(\mathfrak{p}_0)$.²⁴ If $P$ is the
point of $V_r$ defined by $\mathfrak{p}_0$, then $P$ is simple (Theorem 11') and lies on $V_s$
(since $\mathfrak{p} \equiv 0(\mathfrak{p}_0)$).
THE JOHNS HOPKINS UNIVERSITY.


²⁴ If we pass to the ring $\mathfrak{o}/\mathfrak{p}$, then our assertion is equivalent to the following:
if $\alpha \neq 0$, then there exists a zero-dimensional prime ideal which does not contain $\alpha$.
The proof of this assertion is straightforward.




ASSOCIATIVE MULTIPLICATIVE SYSTEMS.* 


By J. E. Eaton. 


1. Introduction. Grouplike systems with non-unique multiplication
were first studied by Marty in 1934.¹ Others who have been interested in such
systems are Wall, Kuntzmann, Ore, Griffiths, and Krasner.² In 1938 Dresher
and Ore³ undertook an axiomatic investigation of the general properties of
such systems, which they called multigroups.

Many of the results of Dresher and Ore were concerned with the relation 
of subsets of the multigroup to the multigroup itself. In this paper we shall 
extend some of these results to algebraic systems with a multivalued associative 
operation multiplication which however need satisfy no quotient law. As in 
multigroups, coset decompositions of the system are possible, and we may 
characterize completely the homomorphisms generated by such decompositions. 
Normal subsets of the system exist which have interesting structure properties. 
In particular for these subsets we may enunciate a Jordan-Hölder theorem.
The theorems derived in this paper in a few instances represent improvements 
in the known theorems of the theory of multigroups. 


2. Definitions. An associative multiplicative system is an algebraic 
system in which there is defined a single binary operation multiplication. We 
shall for brevity throughout this paper refer to such a system as an m-system. 
We shall denote by $\mathfrak{M}$ an arbitrary m-system and by $m_1, m_2, \dots$ the elements


* Received April 26, 1939.
¹ F. Marty, (1) “Sur une généralisation de la notion de groupe,” Huitième congrès
des mathématiciens scandinaves, Stockholm, 1934, pp. 45-49; (2) “Rôle de la notion
d’hypergroupe dans l’étude des groupes non abéliens,” Comptes rendus, vol. 201 (Paris,
1935), pp. 636-638; (3) “Sur les groupes et hypergroupes attachés à une fraction
rationnelle,” Annales de l’école normale, 3 sér., vol. 53 (1936), pp. 82-123.

² H. S. Wall, “Hypergroups,” American Journal of Mathematics, vol. 59 (1937),
pp. 77-98; J. Kuntzmann, (1) “Opérations multiformes. Hypergroupes,” Comptes
rendus, vol. 204 (Paris, 1937), pp. 1787-1788; (2) “Homomorphie entre systèmes
multiformes,” Comptes rendus, vol. 205 (Paris, 1937), pp. 208-210; (3) “Systèmes
multiformes et systèmes hypercomplexes,” Comptes rendus, vol. 208 (Paris, 1939),
pp. 493-495; Oystein Ore, “Structures and group theory, I,” Duke Mathematical
Journal, vol. 3 (1937), pp. 149-174; L. W. Griffiths, “On hypergroups, multigroups, and
product systems,” American Journal of Mathematics, vol. 60 (1938), pp. 345-354;
M. Krasner, “Sur la primitivité des corps $\mathfrak{P}$-adiques,” Mathematica, vol. 13 (1937),
pp. 72-191.

³ Dresher and Ore, “Theory of multigroups,” American Journal of Mathematics,
vol. 60 (1938), pp. 705-733. We shall in the following cite this paper as D. and O.




of $\mathfrak{M}$. We make no assumption on the finiteness of the number of elements
of $\mathfrak{M}$ but we shall suppose that they are at least enumerable. The
multiplication which is defined in an m-system is subject to but two axioms: the existence
of the product of any two elements and the associative law.


AXIOM 1. The Product. If $m_i$ and $m_j$ are any two elements of $\mathfrak{M}$, then
the product $m_i m_j$ is a non-void subset of $\mathfrak{M}$:

$$m_i m_j = \{m'_k\}.$$

The existence of the product of any two elements of $\mathfrak{M}$ permits us to
give meaning to the notion of the product of any two subsets of $\mathfrak{M}$. If $\mathfrak{A}$
and $\mathfrak{B}$ are two non-void subsets of $\mathfrak{M}$ with elements $\{a_i\}$ and $\{b_j\}$ respectively,
then an element $m$ of $\mathfrak{M}$ is in $\mathfrak{A}\mathfrak{B}$ if and only if $m$ is contained in some product
$a_i b_j$. The relation of containing we shall symbolize in the usual manner.
If $\mathfrak{A}$ and $\mathfrak{B}$ are any two subsets of $\mathfrak{M}$, $\mathfrak{A} \supseteq \mathfrak{B}$ shall mean that every element
of $\mathfrak{B}$ is an element of $\mathfrak{A}$.
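
As an illustrative sketch only (it is not part of the paper), the product of subsets and the containing relation can be modelled in Python by storing every element product as a set. The two-element table and the names `TABLE`, `set_product` and `contains` are assumptions made for this example.

```python
from itertools import product

# Hypothetical two-element system: every product is a non-void subset (Axiom 1).
TABLE = {
    ("a", "a"): {"b"},
    ("a", "b"): {"b"},
    ("b", "a"): {"a", "b"},
    ("b", "b"): {"b"},
}


def set_product(table, left, right):
    """Product of two subsets: the union of all element products x*y."""
    result = set()
    for x, y in product(left, right):
        result |= table[(x, y)]
    return result


def contains(big, small):
    """Containing relation: every element of `small` is an element of `big`."""
    return set(small) <= set(big)


whole = {"a", "b"}
print(set_product(TABLE, whole, {"a"}))                    # union of aa and ba
print(contains(whole, set_product(TABLE, whole, whole)))   # the product stays inside the system
```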

AXIOM 2. The Associative Law. If $m_i, m_j, m_k$ are any three elements
of $\mathfrak{M}$, then

$$(m_i m_j) m_k = m_i (m_j m_k).$$

These products have meaning according to the definition of products of subsets.
Kuntzmann⁴ has noted several weaker forms of this associative law which
are of interest in systems in which a non-unique multiplication is defined.
We shall have occasion to refer later to one of these.

AXIOM 2'. The Left Associative Law. If $m_i, m_j, m_k$ are any three
elements of $\mathfrak{M}$, then

$$(m_i m_j) m_k \supseteq m_i (m_j m_k).$$

A multigroup is defined by Dresher and Ore⁵ to be an m-system
satisfying the following quotient law.

AXIOM 3. The Quotient Axiom. For any ordered pair of elements $m_i$,
$m_j$ there exist at least two elements $x$ and $y$ such that

$$x m_i \supseteq m_j; \qquad m_i y \supseteq m_j.$$

3. Representation. Before we develop some of the properties of an
m-system it would be well to derive a representation of such a system. Let
$\mathfrak{M}$ be some multiplicative system satisfying Axiom 1 (we do not assume it is
associative) and consisting of elements $\{m_i\}$. To each $m_i$ associate a matrix
$M_i$ defined in the following manner. If $m_i m_j \supseteq m_k$, then $M_i$ has $e$, the unit
of a Boolean ring, as its $k, j$-th element. The remaining elements of $M_i$ are

⁴ Op. cit., 1, p. 1787.
⁵ D. and O., pp. 706-707.




zeros. We call the set of $M_i$'s the left regular matrix association⁶ of $\mathfrak{M}$.
Similarly $M_i$ is an element of the right regular matrix association of $\mathfrak{M}$ if
when $m_j m_i \supseteq m_k$, then $M_i$ has $e$ in its $k, j$-th place. We say that a matrix
association is a representation if when $m_i m_j = \{m'_k\}$, then $M_i M_j = \sum M'_k$,
where multiplication and addition of the matrices is defined in the usual
manner. We then have the following theorem.


THEOREM 1. The necessary and sufficient condition that a multiplicative
system $\mathfrak{M}$ be associative is that both its left and right regular matrix
associations are representations.

Proof. (i) Necessary. Let $a$, $b$ be elements of $\mathfrak{M}$, $ab = \{c_1, c_2, \dots\}$,
and $A$, $B$, $C_l$ the corresponding matrices of the left regular matrix association.
Let there be $e$ in the $i, k$-th place of $AB$. Then for some $j$ there is $e$ in the
$i, j$-th place of $A$ and in the $j, k$-th place of $B$. Hence there is an $m_j$ such that
$a m_j \supseteq m_i$ and $b m_k \supseteq m_j$. Then $a b m_k \supseteq m_i$. Hence for some $c_l$, $c_l m_k \supseteq m_i$.
Therefore some $C_l$ has $e$ in its $i, k$-th place. Conversely let there be $e$ in the
$i, k$-th place of some $C_l$. Then $c_l m_k \supseteq m_i$. Hence $a b m_k \supseteq m_i$. Then for some
$m_j$, $b m_k \supseteq m_j$ and $a m_j \supseteq m_i$. Hence $A$ has $e$ in the $i, j$-th place and $B$ has $e$ in the
$j, k$-th place. Therefore $AB$ has $e$ in the $i, k$-th place.

(ii) Sufficient. Let the left regular matrix association of $\mathfrak{M}$ be a
representation. Consider $m_i (m_j m_k) \supseteq m_r$. Then for some $m_s$ in $m_j m_k$, $m_i m_s \supseteq m_r$.
Hence $M_j$ has $e$ in the $s, k$-th place and $M_i$ has $e$ in the $r, s$-th place. Then
$M_i M_j$ has $e$ in the $r, k$-th place. This implies that for some $m$ in $m_i m_j$,
$m m_k \supseteq m_r$. We then have $(m_i m_j) m_k \supseteq m_i (m_j m_k)$. The reverse inequality
is obtained if the right regular matrix association is a representation. Hence
we have proved the theorem. However the left regular matrix association
being a representation does not imply the right is. This may be shown by the
system of two elements, $a$ and $b$, whose multiplication scheme is: $aa = b$;
$ab = b$; $ba = a, b$; $bb = b$.
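
The behaviour of this two-element system can be checked mechanically. The following Python sketch is an illustration and not part of the paper; it assumes that the scheme reads aa = b, ab = b, ba = a, b, bb = b and that Axiom 2' is read as $(m_i m_j) m_k \supseteq m_i (m_j m_k)$. All function names are hypothetical.

```python
from itertools import product

ELEMENTS = ("a", "b")
# Assumed reading of the printed multiplication scheme.
TABLE = {
    ("a", "a"): {"b"},
    ("a", "b"): {"b"},
    ("b", "a"): {"a", "b"},
    ("b", "b"): {"b"},
}


def set_product(table, left, right):
    """Product of two subsets: the union of all element products x*y."""
    result = set()
    for x, y in product(left, right):
        result |= table[(x, y)]
    return result


def bracketings(table, x, y, z):
    """Return the pair ((xy)z, x(yz)) as sets."""
    left = set_product(table, table[(x, y)], {z})
    right = set_product(table, {x}, table[(y, z)])
    return left, right


def is_left_associative(table, elements):
    """Axiom 2' as read here: (xy)z contains x(yz) for every triple."""
    return all(l >= r for l, r in
               (bracketings(table, *t) for t in product(elements, repeat=3)))


def is_associative(table, elements):
    """Axiom 2: (xy)z equals x(yz) for every triple."""
    return all(l == r for l, r in
               (bracketings(table, *t) for t in product(elements, repeat=3)))


print(is_left_associative(TABLE, ELEMENTS))  # True
print(is_associative(TABLE, ELEMENTS))       # False
print(bracketings(TABLE, "a", "a", "a"))     # ({'a', 'b'}, {'b'}) up to set ordering
```

Under these assumptions the triple a, a, a already separates the two laws: (aa)a is the whole system while a(aa) contains only b.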


COROLLARY. The necessary and sufficient condition that a multiplicative
system be left associative (satisfy Axiom 2') is that its left regular matrix
association is a representation.
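
The matrix construction of this section can be sketched in Python for a small associative system. The sketch is illustrative and rests on stated assumptions: the entry in row k, column j of $M_i$ is taken to be the Boolean unit exactly when $m_k$ lies in $m_i m_j$, matrices are multiplied and added with Boolean arithmetic, and the test system is the cyclic group of order 3 regarded as a single-valued m-system. None of the names below come from the paper.

```python
def left_regular_matrices(table, elements):
    """Boolean matrices M_i with entry (k, j) true when m_k lies in m_i * m_j."""
    index = {m: n for n, m in enumerate(elements)}
    size = len(elements)
    matrices = {}
    for mi in elements:
        mat = [[False] * size for _ in range(size)]
        for mj in elements:
            for mk in table[(mi, mj)]:
                mat[index[mk]][index[mj]] = True
        matrices[mi] = mat
    return matrices


def bool_matmul(a, b):
    """Boolean matrix product: OR over j of (a[i][j] AND b[j][k])."""
    n = len(a)
    return [[any(a[i][j] and b[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]


def bool_sum(mats):
    """Entrywise OR of a non-empty list of Boolean matrices."""
    n = len(mats[0])
    return [[any(m[i][k] for m in mats) for k in range(n)] for i in range(n)]


def is_left_representation(table, elements):
    """Check M_i M_j == sum of M_c over all c in m_i * m_j."""
    mats = left_regular_matrices(table, elements)
    return all(
        bool_matmul(mats[x], mats[y]) == bool_sum([mats[c] for c in table[(x, y)]])
        for x in elements for y in elements
    )


# Cyclic group of order 3 as a single-valued (hence associative) m-system.
Z3_ELEMENTS = (0, 1, 2)
Z3 = {(x, y): {(x + y) % 3} for x in Z3_ELEMENTS for y in Z3_ELEMENTS}
print(is_left_representation(Z3, Z3_ELEMENTS))  # True
```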


4. Coset decompositions. We shall say that any subset $\mathfrak{A}$ of an
m-system is a subsystem if $\mathfrak{A} \supseteq \mathfrak{A}\mathfrak{A}$. We then have immediately that $\mathfrak{A}$ itself
is an m-system. The subsystems with which we shall be primarily concerned
in this paper are the left reversible subsystems.⁷ A subsystem $\mathfrak{A}$ of an
m-system is left reversible if when $\mathfrak{A} m_i \supseteq m_j$, then $\mathfrak{A} m_j \supseteq m_i$. Similarly $\mathfrak{A}$ is


⁶ Cf. Wall, op. cit., p. 86; D. and O., p. 709.
⁷ The concept of reversibility was introduced in D. and O., p. 715.




right reversible if when $m_i \mathfrak{A} \supseteq m_j$, then $m_j \mathfrak{A} \supseteq m_i$. The set $\mathfrak{A} m_i$ we call a
left coset of $\mathfrak{A}$. The interesting feature of left reversible subsystems of an
m-system is that their left cosets constitute a partition of the m-system in the
precise sense stated in the following theorem.⁸

THEOREM 2. Let $\mathfrak{A}$ be a left reversible subsystem of an m-system $\mathfrak{M}$.
Then $\mathfrak{M}$ has a unique left coset expansion $\mathfrak{M} = \mathfrak{A} m_1 + \mathfrak{A} m_2 + \cdots$ such that

(i) any element in a coset generates the same coset;
(ii) a coset contains its generating element;

(iii) every element of $\mathfrak{M}$ lies in some coset;

(iv) the cosets are disjoint.

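Proof. (i) Consider any coset $\mathfrak{A} m_i$ and any $m_j$ contained in it. Then
there is an $a_i$ in $\mathfrak{A}$ such that $a_i m_i \supseteq m_j$. From left reversibility there is an
$a_j$ in $\mathfrak{A}$ such that $a_j m_j \supseteq m_i$. Since $\mathfrak{A}$ is a subsystem, $\mathfrak{A} \supseteq \mathfrak{A} a$ for all $a$ in $\mathfrak{A}$.
Hence $\mathfrak{A} m_i \supseteq \mathfrak{A} a_i m_i \supseteq \mathfrak{A} m_j \supseteq \mathfrak{A} a_j m_j \supseteq \mathfrak{A} m_i$. Since the first and last
elements of the chain are identical, we must have equality throughout. Thus
$\mathfrak{A} m_i = \mathfrak{A} m_j$ for any $m_j$ in $\mathfrak{A} m_i$.

(ii) $m_i$ is in $\mathfrak{A} m_i$ since $m_i$ is in $\mathfrak{A} m_j$ and $\mathfrak{A} m_i = \mathfrak{A} m_j$.
(iii) From (ii) every $m_i$ in $\mathfrak{M}$ lies in the coset $\mathfrak{A} m_i$.
(iv) Let $m_k$ be contained in both $\mathfrak{A} m_i$ and $\mathfrak{A} m_j$. Then $\mathfrak{A} m_i = \mathfrak{A} m_k = \mathfrak{A} m_j$.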
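
The four properties of the theorem can be observed on a small example. The Python sketch below is an illustration under assumptions: the m-system is the cyclic group of order 6 treated as a single-valued system, the subsystem is {0, 3}, and every function name is hypothetical.

```python
from itertools import product


def set_product(table, left, right):
    """Product of two subsets: the union of all element products x*y."""
    result = set()
    for x, y in product(left, right):
        result |= table[(x, y)]
    return result


def is_subsystem(table, subset):
    """A is a subsystem when A contains AA."""
    return set_product(table, subset, subset) <= set(subset)


def left_coset(table, subset, m):
    """The left coset A m."""
    return frozenset(set_product(table, subset, {m}))


def is_left_reversible(table, elements, subset):
    """Whenever A m_i contains m_j, A m_j must contain m_i."""
    for mi in elements:
        for mj in left_coset(table, subset, mi):
            if mi not in left_coset(table, subset, mj):
                return False
    return True


def left_coset_expansion(table, elements, subset):
    """The set of distinct left cosets A m, m ranging over the system."""
    return {left_coset(table, subset, m) for m in elements}


ELEMENTS = range(6)
Z6 = {(x, y): {(x + y) % 6} for x in ELEMENTS for y in ELEMENTS}
A = {0, 3}

cosets = left_coset_expansion(Z6, ELEMENTS, A)
print(is_subsystem(Z6, A), is_left_reversible(Z6, ELEMENTS, A))
print(sorted(sorted(c) for c in cosets))  # [[0, 3], [1, 4], [2, 5]]
```

Each element lies in exactly one of the three cosets, and every element of a coset generates that coset, in agreement with (i) to (iv).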


COROLLARY. The theorem is true under the weaker assumption that $\mathfrak{M}$
is a left associative multiplicative system.

We may mention that $\mathfrak{M}$ has a unique double coset expansion $\mathfrak{M} = \mathfrak{A} m_1 \mathfrak{B}
+ \mathfrak{A} m_2 \mathfrak{B} + \cdots$ with respect to a left reversible subsystem $\mathfrak{A}$ and a right
reversible subsystem $\mathfrak{B}$.⁹

5. Homomorphisms. Let us consider any partition of an m-system $\mathfrak{M}$
into subsets $\mathfrak{A}_1, \mathfrak{A}_2, \dots$, where the subsets are not necessarily disjoint.
We do however suppose that every element of $\mathfrak{M}$ lies in some $\mathfrak{A}_i$. We have
already noted what we mean by the product $\mathfrak{A}_i \mathfrak{A}_j$ in the element sense. We
may define the set product $\mathfrak{A}_i \mathfrak{A}_j$ to be the totality of those subsets $\mathfrak{A}_k$ which
have at least one of their elements in the element product $\mathfrak{A}_i \mathfrak{A}_j$. The $\mathfrak{A}_i$'s
then obviously form an m-system which is homomorphic to the original
m-system. By a homomorphism of an m-system $\mathfrak{M}$ to an m-system $\mathfrak{M}^*$ we mean
the usual many-one correspondence between the elements of $\mathfrak{M}$ and the
elements of $\mathfrak{M}^*$ which preserves multiplication. That is, every element of $\mathfrak{M}$
corresponds to a unique image element of $\mathfrak{M}^*$ and every element of $\mathfrak{M}^*$ is
the image of at least one element of $\mathfrak{M}$. Furthermore if $m_i, m_j, m_k$ have the

⁸ This is a direct analogue for m-systems of Theorems 8 and 9, D. and O., p. 717.
⁹ Cf. D. and O., p. 718.




respective images $m^*_i, m^*_j, m^*_k$ and if $m_i m_j \supseteq m_k$, then $m^*_i m^*_j \supseteq m^*_k$.
Also if $m^*_i m^*_j \supseteq m^*_k$ then for some $m_i, m_j, m_k$ corresponding to them
$m_i m_j \supseteq m_k$. However, in the homomorphisms which we shall consider we
shall find it convenient to place certain restrictions on the inverse
correspondence from $\mathfrak{M}^*$ to $\mathfrak{M}$. A homomorphism from $\mathfrak{M}$ to $\mathfrak{M}^*$ is a strong
left unit homomorphism if $\mathfrak{M}^*$ contains left scalar units and if for any $m$
and $m'$ with the same image there is an element $a$ in $\mathfrak{M}$ corresponding to some
left scalar unit of $\mathfrak{M}^*$ such that $am \supseteq m'$.
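
The set product of the blocks of a partition, and the m-system it defines, can be sketched in Python as follows. This is an illustration only: the example takes the cyclic group of order 6 as a single-valued m-system with the blocks {0, 3}, {1, 4}, {2, 5}, and the names `quotient_table` and `image` are hypothetical.

```python
from itertools import product


def set_product(table, left, right):
    """Element product of two subsets: the union of all products x*y."""
    result = set()
    for x, y in product(left, right):
        result |= table[(x, y)]
    return result


def quotient_table(table, blocks):
    """Set product of blocks: all blocks meeting the element product X*Y."""
    blocks = [frozenset(b) for b in blocks]
    quotient = {}
    for x_block in blocks:
        for y_block in blocks:
            prod = set_product(table, x_block, y_block)
            quotient[(x_block, y_block)] = {z for z in blocks if z & prod}
    return quotient


def image(blocks, m):
    """Image of an element m: a block containing it (unique for disjoint blocks)."""
    return next(frozenset(b) for b in blocks if m in b)


ELEMENTS = range(6)
Z6 = {(x, y): {(x + y) % 6} for x in ELEMENTS for y in ELEMENTS}
BLOCKS = [{0, 3}, {1, 4}, {2, 5}]
Q = quotient_table(Z6, BLOCKS)

# Multiplication is preserved: the image of each product lies in the product of the images.
preserved = all(
    image(BLOCKS, k) in Q[(image(BLOCKS, i), image(BLOCKS, j))]
    for i in ELEMENTS for j in ELEMENTS for k in Z6[(i, j)]
)
print(preserved)  # True
```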

An element $e$ of an m-system $\mathfrak{M}$ is a left scalar unit¹⁰ if $em = m$ for
every $m$ in $\mathfrak{M}$. An element which is both a left scalar unit and a right scalar
unit ($me = m$) is called an absolute unit. Obviously there can be at most one
absolute unit in any m-system. A set of left scalar units of an m-system $\mathfrak{M}^*$
form a system of fundamental units with respect to a strong left unit
homomorphism of $\mathfrak{M}$ to $\mathfrak{M}^*$ if for any one of them there is an $a$ in $\mathfrak{M}$ corresponding
to it such that for some $m$ and $m'$ with the same image $am \supseteq m'$ and if for
any $m$ and $m'$ with the same image there is an $a$ corresponding to one of them
such that $am \supseteq m'$. We are now in a position to characterize completely the
homomorphisms generated by the left coset decompositions of left reversible
subsystems.¹¹

THEOREM 3. Let $\mathfrak{A}$ be a left reversible subsystem of an m-system $\mathfrak{M}$.
Then $\mathfrak{M}$ is strongly left unit homomorphic to the m-system of the left coset
expansion of $\mathfrak{M}$ with respect to $\mathfrak{A}$, $\mathfrak{M}/\mathfrak{A}$, and the homomorphism has as a set
of fundamental units those cosets containing elements of $\mathfrak{A}$. Conversely by
any strong left unit homomorphism $\mathfrak{M} \to \mathfrak{M}^*$ those elements of $\mathfrak{M}$ which
correspond to any set of fundamental units of $\mathfrak{M}^*$ form a left reversible
subsystem $\mathfrak{A}$ of $\mathfrak{M}$ such that the m-system of the left coset expansion $\mathfrak{M}/\mathfrak{A}$ is
isomorphic to $\mathfrak{M}^*$.

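Proof. Since $\mathfrak{A}$ is a subsystem, $\mathfrak{A} \supseteq \mathfrak{A} a$ for any $a$ in $\mathfrak{A}$. This implies
$\mathfrak{A}\mathfrak{A} \supseteq \mathfrak{A} a \mathfrak{A}$. Furthermore $\mathfrak{A} \supseteq \mathfrak{A}\mathfrak{A}$. Combining we have $\mathfrak{A} \supseteq \mathfrak{A} a \mathfrak{A}$ and
$\mathfrak{A} m \supseteq \mathfrak{A} a \mathfrak{A} m$ for any $m$ in $\mathfrak{M}$. However, since there is but one coset on the
left, we must have equality. Thus the cosets containing elements of $\mathfrak{A}$ are
left scalar units. From Theorem 2 if $m$ and $m'$ lie in the same coset there
is an $a$ in $\mathfrak{A}$ such that $am \supseteq m'$. Hence the homomorphism is a strong left
unit homomorphism which has as fundamental units those cosets containing
elements of $\mathfrak{A}$.

Conversely, let $\{e^*_i\}$ be a set of fundamental units of $\mathfrak{M}^*$ and $\mathfrak{A}$ the
totality of elements of $\mathfrak{M}$ corresponding to them. Suppose $\mathfrak{A}\mathfrak{A} \supseteq b$, where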

¹⁰ For a complete discussion of units cf. D. and O., pp. 710-714.
¹¹ This is an improvement of D. and O., Theorem 13, p. 721.




$b$ is not in $\mathfrak{A}$. Then $\{e^*_i\}\{e^*_i\} \supseteq b^*$, where $b^*$ is not in $\{e^*_i\}$. But
$\{e^*_i\}\{e^*_i\} = \{e^*_i\}$, which yields a contradiction. Thus $\mathfrak{A}$ is a subsystem.
If for some $a$ in $\mathfrak{A}$, $am \supseteq m'$, then $m$ and $m'$ have the same image $m^*$. Since
the homomorphism is a strong left unit homomorphism, there is an $a'$ in $\mathfrak{A}$
such that $a'm' \supseteq m$. Hence $\mathfrak{A}$ is left reversible. But the same argument
shows that two elements lying in the same coset of $\mathfrak{A}$ have the same image and
two elements with the same image lie in the same coset. Therefore $\mathfrak{M}^*$ is
isomorphic to the m-system of the left coset expansion $\mathfrak{M}/\mathfrak{A}$.

COROLLARY. If $\mathfrak{M}$ is a multigroup then $\mathfrak{M}^*$ is a multigroup and $\mathfrak{A}$ is a
right submultigroup (that is, in $\mathfrak{A}$ the second relation of Axiom 3 is satisfied).


Let us now introduce an extremely important class of subsystems, those
which we shall call left normal subsystems. A subsystem $\mathfrak{A}$ of an m-system $\mathfrak{M}$
is left normal in $\mathfrak{M}$ if for any $m$ in $\mathfrak{M}$ we have $\mathfrak{A} m \supseteq m \mathfrak{A}$.¹² With this type
of subsystem we may associate a homomorphism in which we impose a more
rigid condition on the inverse correspondence than we assumed in the previous
theorem. We say that a homomorphism $\mathfrak{M}$ to $\mathfrak{M}^*$ is a strong left
homomorphism if whenever in $\mathfrak{M}^*$ $m^*_i m^*_j \supseteq m^*_k$ there is for every $m_j$ and $m_k$ in
$\mathfrak{M}$ corresponding to $m^*_j$ and $m^*_k$ respectively some $m_i$ corresponding to $m^*_i$
such that $m_i m_j \supseteq m_k$.¹³ We then have the following theorem.¹⁴


THEOREM 4. Let $\mathfrak{A}$ be a left reversible, left normal subsystem of an
m-system $\mathfrak{M}$. Then $\mathfrak{M}$ is strongly left homomorphic to the m-system of the
left coset expansion $\mathfrak{M}/\mathfrak{A}$ and $\mathfrak{M}/\mathfrak{A}$ has as an absolute unit $\mathfrak{A}$. Conversely
by any strong left homomorphism $\mathfrak{M} \to \mathfrak{M}^*$ wherein $\mathfrak{M}^*$ contains an absolute
unit $e^*$, those elements of $\mathfrak{M}$ which correspond to $e^*$ form a left reversible,
left normal subsystem $\mathfrak{A}$ of $\mathfrak{M}$ such that the m-system of the left coset
expansion $\mathfrak{M}/\mathfrak{A}$ is isomorphic to $\mathfrak{M}^*$.

Proof. Let 𝔄m_i 𝔄m_j ⊃ 𝔄m_k. If r ⊂ 𝔄m_j, s ⊂ 𝔄m_k, we have to find an 
x in 𝔄m_i such that xr ⊃ s. Since any element in a coset generates the coset 
we may write 𝔄m_i 𝔄r ⊃ s. From left normality, 𝔄𝔄m_i r ⊃ s. Therefore 
𝔄m_i r ⊃ s. Hence for some a in 𝔄, a m_i r ⊃ s and for some x in a m_i, xr ⊃ s. 
We thus have a strong left homomorphism. From the preceding theorem 
we know that any coset containing an element of 𝔄 is a left scalar unit. But 
for any m in 𝔐 and any a in 𝔄 we have 𝔄m ⊃ 𝔄𝔄m ⊃ 𝔄m𝔄 ⊃ 𝔄m𝔄a. 
Since there is but one coset on the left we must have equality and 𝔄a is a 


12 I have learned by correspondence that M. Krasner has also used this type of 
normality. 

13 This type of homomorphism was introduced in D. and O., p. 721. 

14 The same theorem for normal, reversible submultigroups is proved in D. and O., 
Theorem 1, p. 724. 


228 J. E. EATON. 


right scalar unit. It is thus an absolute unit and since there can be but one 
absolute unit in any m-system, 𝔄a = 𝔄. 

To show the converse we need only, by virtue of the preceding theorem, 
prove that 𝔄 is left normal. Consider any r ⊂ ma. We know r and m have 
the same image since e* is an absolute unit. From the strong left homo- 
morphism there exists an a' such that a'm ⊃ r and hence 𝔄 is left normal. 

COROLLARY. If 𝔐 is a multigroup, then 𝔄 is a submultigroup. 

6. Conjugate subsystems. In seeking an analogue to strong normality 
in multigroups¹⁵ we are led to the following notion. A subsystem 𝔄 of an 
m-system 𝔐 is left scalic if for any m_i and m_j in 𝔐 there exists an m_j' such 
that 𝔄m_j' ⊃ m_i𝔄m_j. We cannot show, as in the case of multigroups, that a 
left scalic subsystem satisfies a normality condition. However we may show 
that if it is left reversible its cosets form a scalar m-system; where by scalar 
m-system we mean an m-system in which multiplication is unique; that is, 
the product of any two elements of the system is a single element. 

THEOREM 5. The necessary and sufficient condition that an m-system 𝔐 
be strongly left unit homomorphic to a scalar m-system 𝔐* is that the m- 
system of the coset decomposition of 𝔐 with respect to some left reversible, 
left scalic subsystem 𝔄 is isomorphic to 𝔐*. 

Proof. In view of Theorem 3, it is only necessary to show that if 𝔄 is 
left scalic, 𝔐/𝔄 is a scalar m-system; and if 𝔐* is a scalar m-system, 𝔄 is 
left scalic. Since for any m_i and m_j there is an m_j' such that 𝔄m_j' ⊃ m_i𝔄m_j, 
we must have 𝔄m_j' ⊃ 𝔄m_i𝔄m_j. As there is but a single coset on the left, 
we then have equality, and hence the product of any two cosets is a single 
coset. Conversely if for any 𝔄m_i and 𝔄m_j we have 𝔄m_j' = 𝔄m_i𝔄m_j we then 
must have 𝔄m_j' ⊃ m_i𝔄m_j since 𝔄 is left reversible. Thus 𝔄 is left scalic. 

The concept of left scalic subsystems leads naturally to the notion of left 
conjugate subsystems. Two subsystems 𝔄, 𝔅 are left conjugates if there exist 
some m and some m' such that for any m_i there exist m_i' and m_i'' which 
satisfy the following relations: 

𝔄m_i' ⊃ m𝔅m_i,   𝔅m_i ⊃ m'𝔄m_i',   𝔅m_i'' ⊃ m'𝔄m_i,   𝔄m_i ⊃ m𝔅m_i''. 

It is obvious that if the m-system considered is a group then the above con- 
struct is the ordinary conjugate of a group. 
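
The group case can be checked directly. The sketch below assumes the four relations read as they are restated in the proof of Theorem 7, writes the products as ordinary coset products in a group, and makes the illustrative choices 𝔅 = m⁻¹𝔄m, m' = m⁻¹, m_i' = m m_i, m_i'' = m⁻¹m_i; these choices are not stated in the original and serve only as an example.

```latex
% Assumption: \mathfrak{M} is a group, \mathfrak{A} a subgroup, and
% \mathfrak{B} = m^{-1}\mathfrak{A}m, so that m\mathfrak{B} = \mathfrak{A}m
% and \mathfrak{B}m^{-1} = m^{-1}\mathfrak{A}.
% Choices: m' = m^{-1}, m_i' = m m_i, m_i'' = m^{-1} m_i.
\begin{align*}
\mathfrak{A}m_i'        &= \mathfrak{A}\,m\,m_i = m\,\mathfrak{B}\,m_i, \\
m'\mathfrak{A}m_i'      &= m^{-1}\mathfrak{A}\,m\,m_i = \mathfrak{B}\,m_i, \\
m'\mathfrak{A}m_i       &= m^{-1}\mathfrak{A}\,m_i = \mathfrak{B}\,m^{-1}m_i = \mathfrak{B}\,m_i'', \\
m\,\mathfrak{B}\,m_i''  &= m\,\mathfrak{B}\,m^{-1}m_i = \mathfrak{A}\,m_i .
\end{align*}
```

Under these assumptions each of the four inclusions in the definition holds with equality, so an ordinary conjugate subgroup is a left conjugate in the sense above.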

THEOREM 6. Left conjugate is a symmetric, reflexive, transitive relation. 

Proof. The symmetry is obvious. The reflexiveness follows from the fact 
that 𝔄 is a subsystem, for we may choose m and m' as elements in 𝔄. Then 
if we choose m_i' and m_i'' equal to m_i we find that we may take 𝔅 = 𝔄 in the 


15 A strongly normal submultigroup is one whose coset expansion forms a group; 
cf. D. and O., p. 728 ff. 






definition of left conjugate. To establish the transitivity assume that we have 
given subsystems 𝔄, 𝔅, and ℭ satisfying 

(1) 𝔄m_i' ⊃ m𝔅m_i,   𝔅m_i ⊃ m'𝔄m_i',   𝔅n_i' ⊃ nℭn_i,   ℭn_i ⊃ n'𝔅n_i', 
(2) 𝔄m_i ⊃ m𝔅m_i'',   𝔅m_i'' ⊃ m'𝔄m_i,   ℭn_i'' ⊃ n'𝔅n_i,   𝔅n_i ⊃ nℭn_i''. 

Select m_i = n_i' in (1). Then 

𝔄m_i' ⊃ m𝔅n_i' ⊃ mnℭn_i ⊃ rℭn_i,   r ⊂ mn; 
ℭn_i ⊃ n'𝔅n_i' ⊃ n'm'𝔄m_i' ⊃ r'𝔄m_i',   r' ⊂ n'm'. 

Select n_i = m_i'' in (2). Then 

𝔄m_i ⊃ m𝔅m_i'' ⊃ mnℭn_i'' ⊃ rℭn_i'', 
ℭn_i'' ⊃ n'𝔅m_i'' ⊃ n'm'𝔄m_i ⊃ r'𝔄m_i. 

Since m_i is arbitrary we may replace it by the n_i of the preceding relation. 
We then see that 𝔄 and ℭ are left conjugates, for note that r and r' are inde- 
pendent of m_i and n_i. 

THEOREM 7. A left reversible, left scalic submultigroup¹⁶ is its own only 
left reversible left conjugate. 

Proof. Suppose 𝔄 is left reversible and left scalic and there exists a left 
reversible subsystem 𝔅 such that 

(1) 𝔄m_i' ⊃ m𝔅m_i      (2) 𝔅m_i ⊃ m'𝔄m_i' 
(1') 𝔅m_i'' ⊃ m'𝔄m_i    (2') 𝔄m_i ⊃ m𝔅m_i''. 

It is easy to show that 𝔄 is a normal submultigroup.¹⁷ But this, together 
with (2), yields 𝔅m_i ⊃ 𝔄r for any m_i and some r depending on m_i. Choose 
m_i = b. Then 𝔅 ⊃ 𝔄b. If we multiply through by 𝔅 on the right, we obtain 
𝔅 ⊃ 𝔄𝔅 = 𝔅𝔄 ⊃ 𝔄. We now establish the reverse inequality. From (2') 
and (1') we have 𝔄m_i ⊃ 𝔄mm'𝔄m_i. As there is but one coset on the left 
we must have equality and mm' is in 𝔄 since m_i is arbitrary and 𝔄 is the only 
left scalar unit. In (2) choose m_i = m_i'' and multiply through by m on the 
left. Then from (2') 𝔄m_i ⊃ 𝔄m_i'. Hence there are m_i in (2) such that 
m_i' is arbitrary. Choose s so that sm ⊃ b for some b in 𝔅. From (1), multi- 
plying through by s on the left, we have s𝔄m_i' ⊃ sm𝔅m_i. Hence 𝔄s𝔄m_i' ⊃ 𝔄𝔅m_i. 
Since 𝔄 is left scalic the left side is a single arbitrary coset. We may choose 
that coset equal to 𝔄. Then 𝔄 ⊃ 𝔅m_i𝔄 for some m_i. But in a two sided 
coset expansion any coset contains its generating element,¹⁸ and so m_i is in 𝔄. 
Hence 𝔄 ⊃ 𝔅𝔄 = 𝔄𝔅 ⊃ 𝔅. Since we have already established the reverse 
inequality, 𝔄 = 𝔅. 

We define the left normalizer of a left reversible subsystem 𝔄 of an 


16 We may readily show that such a submultigroup is strongly normal in the sense 
defined in the paper by Eaton and Ore, "Remarks on multigroups," American Journal 
of Mathematics, this number, pp. 67-71. 

17 Cf. Eaton and Ore, op. cit. 

18 Cf. D. and O., Theorem 10, p. 718. 






m-system 𝔐 to be the totality of elements n_i in 𝔐 such that for each n_i there 
is an n_i' for which for any m in 𝔐 there is an m' and an m'' which satisfy 
the following relations: 

𝔄m' ⊃ n_i𝔄m,   𝔄m ⊃ n_i'𝔄m',   𝔄m'' ⊃ n_i'𝔄m,   𝔄m ⊃ n_i𝔄m''. 

THEOREM 8. The left normalizer 𝔑 of a left reversible subsystem 𝔄 is a 
left reversible subsystem in which 𝔄 is left scalic. 

Proof. Let n_1 and n_2 be two elements of 𝔑. Then from the symmetry of 
the definition of left normalizer, n_1' and n_2' are in 𝔑. Let u be any element 
in the product n_1n_2 and select v as some element in the product n_2'n_1'. In the 
definition of left normalizer let m_1 (the arbitrary m associated with n_1) equal 
m_2'. We then obtain 𝔄m_1' ⊃ u𝔄m_2 and 𝔄m_2 ⊃ v𝔄m_1'. If we now select 
m_2 = m_1'' we find 𝔄m_2'' ⊃ v𝔄m_1 and 𝔄m_1 ⊃ u𝔄m_2''. Since in the derived 
relations both m_1 and m_2 are arbitrary, u is in 𝔑 and 𝔑 is a subsystem. Suppose 
nr ⊃ s for some n in 𝔑. Take m = r in the definition. Since 𝔄 is left 
reversible we may take m' = s. Then we have 𝔄r ⊃ n'𝔄s. If we multiply 
through by 𝔄 on the left and observe that the left hand side is a single coset 
we have 𝔄r = 𝔄n'𝔄s. Then for some a, an's ⊃ r. But all the elements of 𝔄 
are in 𝔑 and hence every element in the product an' is in 𝔑. But for some 
n_1 in this product we must have n_1s ⊃ r. Therefore 𝔑 is left reversible. 
That 𝔄 is left scalic in 𝔑 follows immediately from the definition of left scalic. 

7. Structure properties of normal subsystems. Let us in this section 
derive certain structure theorems for left normal, left reversible subsystems 
of an m-system which will permit us to formulate a Jordan-Hölder theorem 
for these subsystems.¹⁹ 

THEOREM 9. Let 𝔐 be an m-system and 𝔄 and 𝔅 left reversible, left 
normal subsystems. Then the crosscut (𝔄, 𝔅) is a non-void subsystem which 
is left reversible and left normal in both 𝔄 and 𝔅. 

Proof. 𝔇 = (𝔄, 𝔅) is non-void since, from left normality, 𝔄𝔅 ⊃ 𝔅𝔄 ⊃ 𝔄. 
Hence there is an a and a b such that ab ⊃ a'. From left reversibility there 
is an a'' such that a''a' ⊃ b. Since 𝔄 is a subsystem b lies in 𝔄 and hence in 𝔇. 
𝔇 is a subsystem since 𝔇𝔇 lies in both 𝔄 and 𝔅 and hence in 𝔇. Suppose for 
some d, a_1, and a_2 we have da_1 ⊃ a_2. Then for some b', b'a_2 ⊃ a_1. But 
𝔄b' ⊃ b'𝔄. This implies that for some a' in 𝔄, a'b' ⊃ a_1. From left reversi- 
bility a''a_1 ⊃ b' and as before b' lies in 𝔄 and hence in 𝔇. Thus 𝔇 is left 
reversible in 𝔄 and similarly in 𝔅. To show left normality, consider any 
a_1 ⊂ a𝔇. Then for some b, ab ⊃ a_1. Since 𝔅a ⊃ a𝔅 there is a b' such that 


19 All the results contained in this section have been previously proved for normal, 
reversible submultigroups in D. and O., pp. 725-727. Our extension to left normal, left 
reversible subsystems is, of course, equally valid when the m-system is a multigroup. 






b'a ⊃ a_1. The left normality of 𝔇 in 𝔄 follows immediately if we can show 
that b' lies in 𝔄. We know 𝔄b' ⊃ b'𝔄 and thus there is an a' such that 
a'b' ⊃ a_1. Then for some a'', a''a_1 ⊃ b' and as before b' lies in 𝔄 and hence 
in 𝔇. Similarly 𝔇 is left normal in 𝔅. 

THEOREM 10. Let 𝔐 be an m-system and 𝔄 and 𝔅 left reversible, left 
normal subsystems. Then the union [𝔄, 𝔅] is a left reversible, left normal 
subsystem and [𝔄, 𝔅] = 𝔄𝔅. 

Proof. 𝔄𝔅 = 𝔄𝔄𝔅𝔅 ⊃ 𝔄𝔅𝔄𝔅. Hence 𝔄𝔅 is a subsystem. But 𝔄𝔅 ⊃ 𝔅. 
Also 𝔄𝔅 ⊃ 𝔅𝔄 ⊃ 𝔄. Since [𝔄, 𝔅] is the least subsystem containing both 
𝔄 and 𝔅 we must have [𝔄, 𝔅] = 𝔄𝔅. Furthermore 𝔄𝔅 is left reversible, for 
suppose rm ⊃ m' where ab ⊃ r. Then abm ⊃ m'. Hence there is an x ⊂ bm 
such that ax ⊃ m'. There is then an a' and a b' such that a'm' ⊃ x and 
b'x ⊃ m. Hence there exists an s ⊂ b'a' such that sm' ⊃ m. But 𝔄𝔅 = 𝔅𝔄 
since they are both equal to [𝔄, 𝔅]. Thus s is in 𝔄𝔅 and 𝔄𝔅 is left reversible. 
Finally 𝔄𝔅 is left normal since 𝔄𝔅m ⊃ 𝔄m𝔅 ⊃ m𝔄𝔅. 
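
The reversibility step of this proof can be collected into a single chain. The following is only a restatement of the argument, on the assumption that a product of elements denotes the set of all resulting elements and that the inclusion sign of the text is read as "contains"; the symbols x, a', b', s are those of the proof.

```latex
% x \in bm with ax \ni m'; a' and b' are supplied by the left reversibility
% of \mathfrak{A} and of \mathfrak{B} respectively.
\[
a'm' \ni x, \quad b'x \ni m
\;\Longrightarrow\;
b'a'm' \supseteq b'x \ni m
\;\Longrightarrow\;
sm' \ni m \ \text{ for some } s \in b'a' \subseteq \mathfrak{B}\mathfrak{A} = \mathfrak{A}\mathfrak{B}.
\]
```

The last equality is the one place where the identity 𝔄𝔅 = 𝔅𝔄 = [𝔄, 𝔅] is needed: it places s in 𝔄𝔅 itself.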

THEOREM 11. The Dedekind Relation. Let 𝔐 be an m-system and 𝔄, 
𝔅, and ℭ left reversible, left normal subsystems such that ℭ ⊃ 𝔄. Then 

(ℭ, [𝔄, 𝔅]) = [𝔄, (ℭ, 𝔅)]. 

Proof. Let s be in [𝔄, (ℭ, 𝔅)]. Then s is in 𝔄ℭ and hence in ℭ and 
also in [𝔄, 𝔅]. Thus s is in (ℭ, [𝔄, 𝔅]). Any element in (ℭ, [𝔄, 𝔅]) is in ℭ. 
Let c be such an element. We then have for some a and some b, ab ⊃ c. From 
left reversibility there is an a' such that a'c ⊃ b. Since a' is in ℭ so also is b. 
Thus c is in both 𝔄𝔅 and 𝔄ℭ and is consequently in [𝔄, (ℭ, 𝔅)]. 

THEOREM 12. The Isomorphism Theorem. Let 𝔐 be an m-system and 
𝔄 and 𝔅 left reversible, left normal subsystems. Let X/Y denote the m-system 
of the left coset expansion of X with respect to Y. Then 

[𝔄, 𝔅]/𝔄 = 𝔅/(𝔄, 𝔅). 

Proof. Let 𝔇 = (𝔄, 𝔅) and [𝔄, 𝔅] = 𝔄𝔅 = 𝔑. Let 

(1) 𝔅 = Σ 𝔇b_i 

be the left coset decomposition of 𝔅 with respect to 𝔇. Then 

(2) 𝔑 = Σ 𝔄b_i 

is the left coset decomposition of 𝔑 with respect to 𝔄 for firstly every element 
of 𝔑 lies in some coset of (2) since every element of 𝔅 lies in some coset of (1) 
and 𝔄 ⊃ 𝔇. Further the cosets of (2) are distinct since if ab_i ⊃ b_j then 
from left normality b'a ⊃ b_j and from left reversibility a is in 𝔅 and hence 
in 𝔇. But we would then have a contradiction of the assumption that (1) is 
a left coset decomposition. Now let 𝔄b_i𝔄b_j ⊃ 𝔄b_k. Then from left normality 
𝔄b_ib_j ⊃ 𝔄b_k and for some a, ab_ib_j ⊃ b_k. But as before we must then have 




a lies in 𝔅 and hence in 𝔇. This implies 𝔇b_i𝔇b_j ⊃ 𝔇b_k. Conversely if 
𝔇b_i𝔇b_j ⊃ 𝔇b_k then for some a and a', ab_ia'b_j ⊃ b_k, and 𝔄b_i𝔄b_j ⊃ 𝔄b_k. The 
isomorphism is thus established. 

The preceding theorems are sufficient to prove an analogue of the Jordan- 
Hölder theorem for m-systems. A chain of subsystems 𝔄 = 𝔄_0, 𝔄_1, ···, 
𝔄_n = 𝔅 is a left composition series between 𝔄 and 𝔅 when each 𝔄_i is a left 
reversible, left normal subsystem in the preceding and when no other terms 
can be intercalated in the chain. We then have: 

THEOREM 13. Any two left composition series between 𝔄 and 𝔅 have 
the same length and the m-systems of the left coset expansion of consecutive 
terms in one series are isomorphic in some order to those in the other series. 

The proof of the theorem is by induction on the length of the chain and 
is entirely similar to the proof of the corresponding theorem in group theory. 

We may mention that we may define, as in group theory,²⁰ a left quasi- 
normal subsystem 𝔄 of an m-system 𝔐 to be such that 𝔄𝔅 ⊃ 𝔅𝔄 for every 
subsystem 𝔅 of 𝔐. It is then possible to formulate a Jordan-Hölder theorem 
involving strong structure isomorphism for the left quasi-normal, reversible 
subsystems. 

8. Coset decomposition of groups. We may use the preceding results 
to characterize completely the coset decomposition of a group with respect to 
any subgroup. 

THEOREM 14. The necessary and sufficient condition that a partition of 
a group 𝔊 into disjoint subsets be the left coset decomposition with respect to 
a subgroup ℌ is that the m-system 𝔐 of the partition be such that 

(i) 𝔐 contains a left scalar unit E; 

(ii) XA ⊃ A implies X = E. 

Proof. Necessary. ℌ is obviously a left scalar unit of 𝔐. Suppose 
ℌg_iℌg_j ⊃ ℌg_j. Then for some h and h' in ℌ, hg_ih'g_j = g_j. This implies 
g_i is in ℌ and ℌg_i = ℌ. 

Sufficient. If 𝔊 is finite the elements corresponding to E obviously form 
a subgroup. If 𝔊 is not finite we still must have that e, the unit of 𝔊, 
corresponds to E since E is the only left unit. Furthermore if the inverse 
of some element corresponding to E corresponds to X, we must have XE ⊃ E 
and hence X = E. Now let r and r' be two elements of 𝔊 with the same 
image R. We may determine x in 𝔊 such that xr = r'. Then XR ⊃ R and 
X = E. Thus the homomorphism of 𝔊 to 𝔐 is a strong left unit homo- 
morphism and by Theorem 3 𝔐 is isomorphic to the m-system of the left coset 
expansion of 𝔊 with respect to the subgroup corresponding to E. 
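
The computation behind the necessity of condition (ii) can be written out as follows. This is a sketch in ordinary group notation; the element h'' is an assumption introduced here so that the general element of the coset ℌg_j appears on the right, and it reduces to the displayed relation of the proof when h'' is the unit.

```latex
% h, h', h'' are elements of the subgroup \mathfrak{H}; g_i, g_j are coset
% representatives. h'' is a hypothetical name used only in this sketch.
\[
h\,g_i\,h'\,g_j = h''g_j
\;\Longrightarrow\;
h\,g_i\,h' = h''
\;\Longrightarrow\;
g_i = h^{-1}h''h'^{-1} \in \mathfrak{H}
\;\Longrightarrow\;
\mathfrak{H}g_i = \mathfrak{H}.
\]
```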


YALE UNIVERSITY. 


20 Cf. Oystein Ore, op. cit., p. 162. 




A NEW PROOF FOR A METRICALLY TRANSITIVE SYSTEM.* 


By GUSTAV A. HEDLUND. 


1. Introduction. Two distinct methods have been used to prove that 
the flows defined by the geodesics on suitably restricted surfaces of constant 
negative curvature are metrically transitive. The first of these [2, 3]¹ involves 
the use of symbolism to characterize the geodesics and the proof is restricted 
to those surfaces for which a suitable symbolism has been devised. The second 
of these methods [6, 7, 8] makes use of the theory of harmonic functions and 
is valid for all complete surfaces of constant negative curvature and of finite 
area. It is the opinion of the author that both of these methods involve 
excessive machinery and it would seem desirable to derive a more simple and 
straightforward proof of the result under discussion. 

The present paper gives a new method of proof of the metric transitivity 
of the flow defined by the geodesics on any closed orientable surface of con- 
stant negative curvature. It seems to the writer that the present proof is 
considerably simpler than any previously given. The method extends readily 
to the general class of complete surfaces of constant negative curvature and 
of finite area. 


2. Two-dimensional manifolds of constant negative curvature. Let 
Ψ be the interior of the unit circle U, x² + y² = 1. To Ψ we assign the metric 

(2.1)    ds² = 4(dx² + dy²) / [c(1 − x² − y²)²],    c > 0, 

the Gaussian curvature of which is −c. The metric (2.1) assigns a length 
to curves in Ψ and this length is termed hyperbolic length or H-length. Angle 
is euclidean angle and the element of (hyperbolic) area is 

(2.2)    dσ = 4 dx dy / [c(1 − x² − y²)²]. 


The geodesics defined by (2.1) are arcs of circles orthogonal to U and 
are called hyperbolic lines or H-lines. An H-line is uniquely determined by 
two points of U and these points are the points at infinity of the H-line. The 
hyperbolic distance or H-distance between two points of Ψ is defined to be the 
H-length of the unique H-line segment joining the points. 
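
As a check on the normalization of (2.1), the curvature and the H-distance from the origin can be computed directly. The sketch below uses the standard formula K = −Δ(log λ)/λ² for a conformal metric ds² = λ²(dx² + dy²); the abbreviations λ, u and ρ are introduced here for illustration and do not appear in the original.

```latex
% Conformal metric ds^2 = \lambda^2 (dx^2 + dy^2), with u = 1 - x^2 - y^2.
\begin{align*}
\lambda^2 &= \frac{4}{c\,u^2}, \qquad
\log\lambda = \tfrac12\log\frac{4}{c} - \log u, \\
\Delta(\log\lambda)
  &= \Bigl(\frac{2}{u} + \frac{4x^2}{u^2}\Bigr)
   + \Bigl(\frac{2}{u} + \frac{4y^2}{u^2}\Bigr)
   = \frac{4u + 4(x^2 + y^2)}{u^2} = \frac{4}{u^2}, \\
K &= -\frac{\Delta(\log\lambda)}{\lambda^2}
   = -\frac{4}{u^2}\cdot\frac{c\,u^2}{4} = -c .
\end{align*}
% H-distance from the origin O to a point P at euclidean distance \rho < 1,
% measured along the radius OP, which is an H-line through O:
\[
H(O,P) = \int_0^{\rho}\frac{2\,dt}{\sqrt{c}\,(1 - t^2)}
       = \frac{1}{\sqrt{c}}\,\log\frac{1+\rho}{1-\rho}.
\]
```

The second formula shows that the H-distance from O increases with the euclidean distance from O, and becomes infinite as P approaches U.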


* Received November 17, 1939. 
¹ The numbers in brackets refer to the bibliography. 






234 GUSTAV A. HEDLUND. 


A horocycle is a euclidean circle internally tangent to U. The point A 
of contact of the horocycle with U is the point at infinity of the horocycle. 
Let r̄ denote the minimum H-distance from the origin O to an arbitrary point 
of the horocycle, and let r = +r̄ or −r̄ according as O is interior or exterior 
to the horocycle. The point at infinity A and the constant r uniquely deter- 
mine a horocycle and this horocycle will be denoted by C(A, r). The horocycle 
C(A, r) is an orthogonal trajectory of the set of H-lines having A as one point 
at infinity. 
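
The signed constant attached to a horocycle can be related to its euclidean radius. The following sketch assumes the radial H-distance formula that follows from (2.1), namely (1/√c) log((1+t)/(1−t)) for a point at euclidean distance t from O; the symbols ρ_e and t are hypothetical names introduced here.

```latex
% Assumption: \rho_e is the euclidean radius of the horocycle, 0 < \rho_e < 1.
% Its euclidean centre lies on the radius OA at distance 1 - \rho_e from O,
% so the point of the horocycle nearest O is at euclidean distance
% t = |1 - 2\rho_e|, and O is interior to the horocycle exactly when \rho_e > 1/2.
\[
\bar r = \frac{1}{\sqrt{c}}\,\log\frac{1+t}{1-t},
\qquad
r = \pm\,\bar r = \frac{1}{\sqrt{c}}\,\log\frac{\rho_e}{1-\rho_e}.
\]
```

In this form the sign convention is automatic: r is positive when ρ_e > 1/2, negative when ρ_e < 1/2, and zero when the horocycle passes through O.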

The metric (2.1) is invariant under linear fractional transformations 
which take Ψ into Ψ, so that under such transformations, hyperbolic distance, 
angle and area are invariant. 

Let F be a Fuchsian group with U as principal circle (cf. [1], Ch. III). 
Such a group has a normal fundamental region containing the origin and 
bounded by H-lines or H-line segments which are congruent in pairs. If to 
this domain is added a suitably chosen subset of the vertices and a suitably 
chosen subset of the sides, the resulting region R is such that no two points 
of it are congruent and any point of Ψ is congruent to some point of R. We 
assume that F is such that the closure of R contains no points of U. It follows 
that F is of the first kind and has a finite set of generators, while the region 
R has a finite set of sides. 

If points which are congruent under F are considered identical there is 
defined a closed two-dimensional manifold M(F, c) of constant negative curva- 
ture. In the case in which M(F, c) contains no singular points, all the 
transformations of F are hyperbolic. The genus of M(F, c) is then necessarily 
greater than one. 

An element e in Ψ is a point P of Ψ together with a direction at that 
point and can be specified by three coördinates (x, y, φ), where x and y are 
the coördinates of P and φ, 0 ≤ φ < 2π, is an angular coördinate measured 
positively in the counterclockwise sense from the directed H-ray which has P 
as initial point and is part of the directed H-line which passes through P and 
has (1, 0) as initial point at infinity. The point P is the point bearing the 
element e. A neighborhood of the element (x_1, y_1, φ_1) is the set (x, y, φ) 
such that 

H(P, P_1) < δ,    ‖φ − φ_1‖ < δ, 

where P is the point (x, y), P_1 is the point (x_1, y_1), H(P, P_1) denotes the 
H-distance between P and P_1, ‖φ − φ_1‖ denotes the least value of the set 
|φ − φ_1 + 2nπ|, (n = 0, ±1, ±2, ···), and δ > 0. Let 𝔈 denote the 
space of elements in Ψ with neighborhoods thus defined. 

A transformation of F carries an element into a congruent element. The 






space Ω of elements on M(F, c) is the space obtained by identifying congruent 
elements of 𝔈. If neighborhoods in Ω are defined by correspondence with the 
neighborhoods defined in 𝔈 (cf. [9], p. 32), Ω is a Hausdorff space. It is 
easily shown that Ω is separable and regular and it follows that a metric 
yielding an equivalent topology can be assigned to Ω. With such a metric 
assigned, the diameter of a subset of Ω is defined. 

An element (x, y, φ) determines a unique point of Ω, and this point will 
be called either the point of Ω determined by (x, y, φ) or simply the point 
(x, y, φ) of Ω. 

If measure is defined in 𝔈 by means of the volume element 

dσ dφ, 

where dσ is given by (2.2), congruent measurable sets of 𝔈 have the same 
measure and this measure serves to define measure in Ω (cf., e. g., [4]). The 
hyperbolic lines or geodesics define a geodesic flow G_s in Ω (cf. [4]) and G_s 
is a measure preserving transformation of Ω into itself which is defined for 
each real s. 


3. The tubular property of invariant sets. Let A be a point of U 
and let α be the smallest positive angle which the radius OA forms with the 
positive x-axis. The points of U are then in 1-1 correspondence with 
the interval 0 ≤ α < 2π and the horocycle C(A, r) can be denoted by C(α, r). 

Let C(α, r) be a horocycle, some point of which is in the region R. The 
points of C(α, r) in R form one or more connected arcs of C(α, r). Let this 
set of arcs be denoted by a(α, r) and let the number of arcs in this set be 
N(α, r). Since a horocycle can cross an H-line in at most two points, the 
number N(α, r) is not greater than twice the number of sides of R. Since R 
is determined by F, and since, under the assumptions we have made, R has a 
finite number of sides, there exists a uniform upper bound N_a for the integers 
N(α, r), where this upper bound is determined by F. Let L(α, r) denote the 
H-length of the shortest arc of C(α, r) which contains all the arcs a(α, r). 
Since the closure of R contains no points of U, there is a uniform upper bound 
L_a for the numbers L(α, r) and again L_a is completely determined by F. 

Consider the set of arcs a(α, r) and the elements externally normal to 
C(α, r) at the points of a(α, r). The points of Ω which correspond to these 
elements form a set of arcs a_e(α, r) of Ω and the totality of arcs a_e(α, r) 
obtained by considering all admissible values of α and r form a set which is 
identical with Ω. By considering elements internally normal to C(α, r) we 
obtain similarly a division of Ω into arcs a_i(α, r). 

Since the closure of R lies interior to U, the values of r assumed in the 




determination of the arcs a_e(α, r) or a_i(α, r) lie between two suitably chosen 
constants r_1 and r_2 such that r_1 < 0 < r_2. It follows that the values of (α, r) 
determining arcs a_e(α, r) {a_i(α, r)} form a set 𝔊_e {𝔊_i} of the rectangle ℜ, 
0 ≤ α < 2π, r_1 < r < r_2. 

Let the rectangle ℜ be divided into a net Λ_n of n² rectangles by n lines 
parallel to the α-axis and n lines parallel to the r-axis. We assume that the 
sequence of nets Λ_1, Λ_2, ··· has been so determined that the maximum dia- 
meter (measured in terms of euclidean distance in ℜ) of the rectangles of 
the net Λ_n approaches zero as n becomes infinite. It is to be understood that 
any one of the rectangles of the net Λ_n includes just one vertex, namely the 
lower left corner, and two open sides, the lower horizontal and the left vertical. 
The division of ℜ into the net Λ_n effects a division of the set 𝔊_e {𝔊_i} into 
subsets, a subset being the points of 𝔊_e {𝔊_i} in a rectangle of the net Λ_n. 
The totality of arcs a_e(α, r) {a_i(α, r)} of Ω corresponding to the points of 
𝔊_e {𝔊_i} in a single rectangle of Λ_n will be called a tube of Ω. Thus, corre- 
sponding to the net Λ_n there is a division of Ω into two sets of tubes and we 
will denote these sets by T_e(n) and T_i(n) respectively. A tube is evidently 
a measurable set and for each positive integer n, Ω is the sum of the tubes of 
T_e(n) {T_i(n)}. 

Let E and G be measurable sets of Ω and let G be of positive measure. 
The relative density of the set E in the set G is defined to be the number 
m(E·G)/mG. 

THEOREM 3.1. Let E be a measurable invariant set of Ω. Then, 
given ε > 0, there exists a positive integer N such that if n > N, the set E, 
except possibly for a set of measure less than ε, lies in tubes of T_e(n) {T_i(n)} 
in which the relative density of the set E is at least 1 − ε. 


The proof will be restricted to the case of tubes of T_e(n). An analogous 
proof applies to the other case. 

Under the measure preserving transformation G_s of Ω into itself, the set 
a_e(α, r) is transformed into a set of Ω determined by the elements externally 
normal to a set of segments of the horocycle C(α, r + s). Let L_s(α, r) 
be the H-length of the shortest segment σ_s(α, r) of C(α, r + s) containing this 
set of segments. As s → −∞, L_s(α, r) → 0. Since the H-lengths L_0(α, r) = L(α, r) 
are uniformly bounded by the constant L_a, the numbers L_s(α, r) approach 
zero uniformly as s → −∞. It follows that the diameter of the set of Ω 
determined by the elements externally normal to C(α, r + s) along σ_s(α, r), 
and hence the diameter of the set G_s(a_e(α, r)), approaches zero as s → −∞. 

Let t_e(n) denote an arbitrary tube of the set T_e(n). As s → −∞, the 
diameter of the set G_s[t_e(n)] does not in general approach zero. But if n is 




large, the tube t_e(n) consists of elements uniformly near the elements of a 
set a_e(α, r) and if s < 0 is properly chosen, the diameter of the set G_s[t_e(n)] 
will be small. Let s_n denote a value of s for which the maximum diameter 
of the sets G_s[t_e(n)] is a minimum, s ranging over the interval −∞ < s < 0, 
while t_e(n) ranges over all tubes of the set T_e(n), and let this minimax 
diameter be d(n). It is then geometrically evident that lim d(n) = 0 as n → ∞. 


Since, for any given positive integer n, the tubes t_e(n) of T_e(n) form a 
division of Ω into non-overlapping measurable subsets whose sum is Ω, the 
same is true of the sets G_{s_n}[t_e(n)]. Since G_{s_n} is a measure preserving trans- 
formation of Ω into itself and E is an invariant set of Ω, it follows that 

m{G_{s_n}[t_e(n)]} = m[t_e(n)], 
and 
m{G_{s_n}[t_e(n)]·E} = m[t_e(n)·E]. 

Hence the relative density (if it exists) of the set E in t_e(n) is identical with 
the relative density of E in the set G_{s_n}[t_e(n)]. To prove the stated theorem 
it suffices to prove the following lemma. 


LEMMA 3.1. Let E be a measurable set of Ω and let Δ_n, n = 1, 2, ···, 
denote a division of Ω into a set of subsets called cells such that: (1), the 
number of cells in Δ_n is finite; (2), the sum of the cells in Δ_n is Ω; (3), each 
of the cells forming Δ_n is measurable; and (4), if d_n denotes the maximum 
diameter of the cells of Δ_n, lim d_n = 0 as n → ∞. Then given ε > 0, there exists a 
positive integer N such that if n > N, the set E, with the exception of a set 
of measure less than ε, lies in cells of Δ_n in which the relative density of E 
is at least 1 − ε. 


Since E is a measurable set and mΩ < ∞, corresponding to ε > 0, there 
exists an open set E_0 of Ω such that E_0 ⊃ E and m(E_0 − E) < ε²/2. 

Let Δ*_n denote the subset of cells of Δ_n lying in E_0. The set Δ*_n con- 
tains any point of E_0 which is the center of an open sphere of radius d_n made 
up of points of E_0. It follows that any point of E_0 lies in the set Δ*_n for n 
sufficiently large and hence, corresponding to ε > 0, there exists a positive 
integer N such that m(E_0 − Δ*_n) < ε/2, provided n > N. The following 
inequalities evidently hold. 

(3.1)    m(Δ*_n − E·Δ*_n) ≤ m(E_0 − E) < ε²/2,    n > N. 

(3.2)    m(E − E·Δ*_n) ≤ m(E_0 − Δ*_n) < ε/2,    n > N. 

Let Δ**_n denote the set of cells of Δ*_n in which the relative density of 
the set E is less than 1 − ε. It follows that 




ε mΔ**_n ≤ m(Δ**_n − Δ**_n·E), 
and since 
Δ**_n − Δ**_n·E ⊂ E_0 − E,    n > N, 

we infer with the aid of (3.1) that 
(3.3)    mΔ**_n < ε/2,    n > N. 

Except possibly for the sum of the sets E − E·Δ*_n and E·Δ**_n, the set 
E lies in cells of Δ*_n in which the relative density of E is at least 1 − ε. 
But the measure of each of the sets E − E·Δ*_n and E·Δ**_n is, according to 
(3.2) and (3.3), less than ε/2 if n > N, and thus we can infer the truth 
of the stated lemma. 

The proof of Theorem 3.1 is complete. 
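
The measure estimate in the proof of Lemma 3.1 can be summarized in two lines. The sketch below assumes the reading in which the open set E_0 is chosen with m(E_0 − E) < ε²/2 and in which the two starred families of cells are distinguished as Δ*_n and Δ**_n; both readings are inferred from the inequalities (3.1) to (3.3) rather than taken verbatim from the text.

```latex
% \Delta_n^{*}: cells of \Delta_n lying in the open set E_0.
% \Delta_n^{**}: those cells of \Delta_n^{*} in which the relative density
% of E is below 1 - \epsilon, so that on each of them the part outside E has
% measure at least \epsilon times the measure of the cell.
\begin{align*}
\epsilon\, m\Delta_n^{**}
  &\le m\bigl(\Delta_n^{**} - \Delta_n^{**}\cdot E\bigr)
   \le m(E_0 - E) < \frac{\epsilon^2}{2}
  \;\Longrightarrow\; m\Delta_n^{**} < \frac{\epsilon}{2}, \\
m\bigl(E - E\cdot\Delta_n^{*}\bigr) + m\bigl(E\cdot\Delta_n^{**}\bigr)
  &< \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon .
\end{align*}
```

The second line bounds the measure of the exceptional part of E, which is all the lemma requires.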

The following evident extension of Theorem 3.1 will be useful. 


THEOREM 3.2. Let T*_e(n) {T*_i(n)} denote a division of each of the 
tubes of T_e(n) {T_i(n)} into measurable non-overlapping subsets such that 
for each n the number of the sets in T*_e(n) {T*_i(n)} is finite and their sum 
is Ω. Let E be a measurable invariant set of Ω. Then given ε > 0 there 
exists a positive integer N such that if n > N, the set E, except for a set of 
measure less than ε, lies in sets of T*_e(n) {T*_i(n)} in which the relative 
density of the set E is at least 1 − ε. 


4. Metric transitivity. Let e be the element (x, y, φ). This element 
can also be specified by three coördinates (α, r, h), where α and r are the 
numbers determining the horocycle C(α, r) which passes through (x, y) and 
has e as an exterior normal element, while h is the oriented hyperbolic arc- 
length on C(α, r), measured positively in the clockwise sense on C(α, r) from 
the point of C(α, r) which is nearest the origin. The transformation from 
(x, y, φ) to (α, r, h) is analytic with non-vanishing Jacobian in the set 
x² + y² < 1, 0 < φ < 2π. 

Let Ω* denote the subset of Ω determined by the elements (x, y, φ) such 
that (x, y) is in the interior of the region R and 0 < φ < 2π. It is evident 
that m(Ω − Ω*) = 0. If E is a measurable subset of Ω, the measure of the 
set E·Ω* coincides with that of E and by the transformation from (x, y, φ) 
to (α, r, h) defined above, the set E·Ω* can be represented in the (α, r, h) 
space by a measurable bounded set, the measure of which is defined to be that 
of E·Ω*. If the metric density of E at any point of Ω* is defined by means 
of cubes in the (α, r, h) space, it is well known that the metric density of E 
is 1 at almost all points of E and 0 at almost all points of Ω − E. 






LEMMA 4.1. If e_1(α_1, r_1, h_1) and e_2(α_1, r_1, h_2), h_1 < h_2, are elements 
such that the points bearing all the elements 

(α_1, r_1, h),    h_1 ≤ h ≤ h_2, 

are interior to R and if the metric density of the measurable invariant set E 
is 1 at e_1, then the metric density of E at e_2 is also 1. 

We assume in the proof of this lemma that α_1 ≠ 0. If α_1 were zero, 
a slight rotation of the region R would permit the application of the given 
proof. 

Under this condition the constant δ > 0 can be chosen so small that all 
the points (α, r, h) satisfying the condition 

|α − α_1| < δ,    |r − r_1| < δ,    h_1 − δ < h < h_2 + δ 

are in Ω*. We denote this set by B_δ. The subsets of B_δ determined by the 
inequalities 

|α − α_1| < δ,    |r − r_1| < δ,    |h − h_i| < δ,    (i = 1, 2), 

will be denoted by C_δ and D_δ, respectively. If we let 

λ_δ = m(E·C_δ)/mC_δ, 

it follows from the hypotheses of the lemma that lim λ_δ = 1 as δ → 0. For the 
moment we hold δ fast. 

Let T_e(n) be a division of Ω into tubes as defined in § 3, and let the sets 
T*_e(n) be determined as follows. If a tube of T_e(n) contains no points of 
B_δ, the tube is a set of T*_e(n), while if a tube t_e(n) of T_e(n) contains a 
point of B_δ we divide t_e(n) into two sets t_e(n)·B_δ and t_e(n) − t_e(n)·B_δ, 
both of which are sets of T*_e(n). The sets T*_e(n) then fulfill the conditions 
imposed in Theorem 3.2 and given ε > 0, there exists a positive integer N 
such that if n > N, the set E, except possibly for a set of measure less than ε, 
lies in sets of T*_e(n) in which the relative density of E is at least 1 − ε. 
Let the set obtained by excluding the exceptional set from E be denoted by 
E*_ε. Then m(E − E*_ε) < ε if n > N, and we assume that n is so chosen. 

The subsets of T*_e(n) containing points of B_δ form a set of rectangular 
parallelepipeds (in (α, r, h) space) 

(4.1)    t_i^n,    (i = 1, 2, ···, ν(n)). 

The sets 

(4.2)    t_i^n·C_δ,    (i = 1, 2, ···, ν(n)), 




form a division of C_δ into non-overlapping subsets (rectangular parallelepipeds) 
and let 

(4.3)    t_{i_k}^n·C_δ,    (k = 1, 2, ···, μ(n)), 

denote those sets of (4.2) which contain points of E*_ε. It follows that 

(4.4)    m( Σ_{k=1}^{μ(n)} t_{i_k}^n·C_δ ) ≥ m(E*_ε·C_δ). 

Since 

m(E·C_δ) = λ_δ mC_δ, 

it follows from the condition m(E − E*_ε) < ε and (4.4) that 

(4.5)    m( Σ_{k=1}^{μ(n)} t_{i_k}^n·C_δ ) ≥ λ_δ mC_δ − ε. 

But the transformation 

α' = α,    r' = r,    h' = h + const. 

is measure preserving in Ω (cf. [5]) and thus 

(4.6)    mC_δ = mD_δ;    m(t_i^n·C_δ) = m(t_i^n·D_δ),    (i = 1, 2, ···, ν(n)). 

We infer from (4.5) and (4.6) that 

(4.7)    m( Σ_{k=1}^{μ(n)} t_{i_k}^n·D_δ ) ≥ λ_δ mD_δ − ε. 

Since 

m(t_{i_k}^n·E*_ε) ≥ (1 − ε) m t_{i_k}^n,    (k = 1, 2, ···, μ(n)), 

it follows that 

(4.8)    m{E*_ε·(t_{i_k}^n·D_δ)} ≥ m{t_{i_k}^n·D_δ} − ε m t_{i_k}^n,    (k = 1, 2, ···, μ(n)). 

Summing over k = 1, 2, ···, μ(n), we obtain 

m(E*_ε·D_δ) ≥ m( Σ_{k=1}^{μ(n)} (t_{i_k}^n·D_δ) ) − ε m( Σ_{k=1}^{μ(n)} t_{i_k}^n ). 

From this inequality and (4.7) we infer that 

m(E*_ε·D_δ) ≥ λ_δ mD_δ − ε − ε mB_δ, 

whence 

(4.9)    m(E·D_δ)/mD_δ ≥ m(E*_ε·D_δ)/mD_δ ≥ λ_δ − ε/mD_δ − ε mB_δ/mD_δ. 

Since, for a given δ > 0, ε can be chosen arbitrarily small, we infer that 

m(E·D_δ)/mD_δ ≥ λ_δ. 






This implies that the lower metric density of E at (α_1, r_1, h_2) is at least as 
great as the metric density of E at the point (α_1, r_1, h_1). But the latter was 
assumed to be 1, and the statement of the lemma is proved. 
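
The final passage to the limit can be stated in one line. The sketch assumes the notation C_δ, D_δ for the two cubes about the first and second element and λ_δ for the relative density of E in C_δ, as these symbols are read from the damaged displays of the proof.

```latex
% \lambda_\delta = m(E \cdot C_\delta)/mC_\delta tends to 1 as \delta \to 0,
% because the metric density of E at the first element is 1 by hypothesis.
\[
1 \;\ge\; \frac{m(E\cdot D_\delta)}{mD_\delta} \;\ge\; \lambda_\delta
\quad\Longrightarrow\quad
\lim_{\delta\to 0}\frac{m(E\cdot D_\delta)}{mD_\delta} = 1 .
\]
```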

The element obtained by rotating a given element through 180° is termed 
the element opposite the given element. The proof of the following lemma is 
closely analogous to that of Lemma 4. 1, and will be omitted. 


LEMMA 4.2. If $e_1 = (\alpha_1, r_1, h_1)$ and $e_2 = (\alpha_1, r_1, h_2)$, $h_1 < h_2$, are elements 
such that the points bearing all the elements 

$(\alpha_1, r_1, h)$, $h_1 \le h \le h_2$, 

are interior to R, and if the metric density of the measurable invariant set E 
is 1 at the element opposite $e_1$, then the metric density of E at the element 
opposite $e_2$ is also 1. 


THEOREM 4.1. (Metrical Transitivity.) If E is a measurable invariant 
set of Ω, either mE = 0 or m(Ω − E) = 0. 


It is sufficient to show that if mE > 0, then mE = mΩ. If mE > 0, 
E contains a point p(x, y, φ), (x, y) interior to R, such that the metric 
density of E at p is 1. Since E is invariant under the measure preserving 
geodesic flow $G_t$, each element on the directed geodesic of which p is an 
element will be a point of E at which the metric density of E is 1. Since (x, y) 
is interior to R, there is a connected arc $a_e$ of the horocycle $C_e$ of which p is 
a normal exterior element such that $a_e$ contains (x, y) and lies in the interior 
of R. According to Lemma 4.1, each element which is externally normal to 
$C_e$ at a point of $a_e$ is a point at which the metric density of E is unity. 

Similarly, there is an arc $a_i$ of the horocycle $C_i$ of which p is a normal 
interior element such that $a_i$ contains (x, y) and lies in the interior of R. 
According to Lemma 4.2, each element which is internally normal to $C_i$ at a 
point of $a_i$ is a point at which the metric density of E is 1. 

Thus there are three possible transformations of an element such that if 
the metric density of E at the element is initially 1, it is 1 after the 
transformation. It is a simple geometrical problem to show that given any φ' 
such that 0 ≤ φ' < 2π, it is possible to transform (x, y, φ) into (x, y, φ') 
by means of these transformations without passing out of a small neighborhood 
of the point (x, y). Thus, if (x, y, φ) is an element at which the metric 
density of E is 1, then the metric density of E is 1 at all the points (x, y, φ'), 
0 ≤ φ' < 2π. If φ' is chosen properly, the directed geodesic determined by 
(x, y, φ') will pass through the origin, and thus some point (0, 0, φ) of Ω is 
a point at which the metric density of E is 1. But then all the points (0, 0, φ), 
0 ≤ φ < 2π, are points at which the metric density of E is 1. By reversing 
this process we infer that every (x, y, φ), (x, y) interior to R, 0 ≤ φ < 2π, 
is a point at which the metric density of E is 1. It follows that mE = mΩ 
and the proof of the theorem is complete. 


UNIVERSITY OF VIRGINIA, 
CHARLOTTESVILLE, VA. 


BIBLIOGRAPHY. 


1. L. R. Ford, Automorphic Functions, New York, 1929. 

2. G. A. Hedlund, “On the metrical transitivity of the geodesics on closed 
surfaces of constant negative curvature,” Annals of Mathematics, vol. 35 (1934), 
pp. 787-808. 

3. G. A. Hedlund, “A metrically transitive group defined by the modular group,” 
American Journal of Mathematics, vol. 57 (1935), pp. 668-678. 

4. G. A. Hedlund, “The dynamics of geodesic flows,” Bulletin of the American 
Mathematical Society, vol. 45 (1939), pp. 241-260. 

5. G. A. Hedlund, “Fuchsian groups and mixtures,” Annals of Mathematics, vol. 40 
(1939), pp. 370-383. 

6. E. Hopf, “Fuchsian groups and ergodic theory,” Transactions of the American 
Mathematical Society, vol. 39 (1936), pp. 299-314. 

7. E. Hopf, “Ergodentheorie,” Ergebnisse der Mathematik und ihrer Grenzgebiete, 
vol. 5 (1937). 

8. E. Hopf, “Beweis des Mischungscharakters der geodätischen Strömung auf 
Flächen der Krümmung minus Eins und endlicher Oberfläche,” Sitzungsberichte der 
Preussischen Akademie der Wissenschaften, 1938, pp. 333-334. 

9. H. Seifert and W. Threlfall, Lehrbuch der Topologie, Berlin, 1934. 



THE EULER NUMBER OF A RIEMANN MANIFOLD.* 


By CARL B. ALLENDOERFER. 


1. Introduction. One of the chief links between the differential 
geometry and the topology of two dimensions is the corollary to the Gauss-
Bonnet theorem which states: The integral of the total curvature of a two 
dimensional closed surface over the surface is equal to 2πN, where N is the 
Euler number of the surface. Since the Gauss-Bonnet theorem is of intrinsic 
character, this theorem does not require the surface to be a subspace of any 
Euclidean space. 
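
As a hedged numerical illustration of this corollary (not part of the original argument), the sketch below integrates the total curvature over a torus of revolution. The radii `R` and `r`, the step count, and the function name are assumptions chosen for the example. The standard formulas $K = \cos v / (r(R + r\cos v))$ and $dA = r(R + r\cos v)\,du\,dv$ are used, so the integral should come out near 0, in agreement with Euler number N = 0 for the torus.

```python
import math

def torus_total_curvature(R: float = 2.0, r: float = 1.0, steps: int = 400) -> float:
    """Midpoint-rule estimate of the integral of K over a torus of revolution.

    R and r (with R > r > 0) are illustrative radii, not values from the paper.
    """
    du = 2.0 * math.pi / steps
    dv = 2.0 * math.pi / steps
    total = 0.0
    for j in range(steps):
        v = (j + 0.5) * dv
        K = math.cos(v) / (r * (R + r * math.cos(v)))   # Gaussian curvature
        dA = r * (R + r * math.cos(v)) * du * dv        # area element
        total += steps * K * dA                         # integrand is independent of u
    return total

# Euler number recovered from (integral of K) = 2*pi*N
euler_number_estimate = torus_total_curvature() / (2.0 * math.pi)
```

For the unit sphere the same relation gives 4π = 2π · 2, that is, N = 2.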

An alternative, but less inclusive, proof of this theorem can be given 
which avoids the Gauss-Bonnet theorem and uses instead the property that 
the surface lies in a three dimensional Euclidean space. This proof can be 
generalized to a closed Riemann space $R_n$ of even dimension which is a 
subspace of an n + 1 dimensional Euclidean space, i.e., a hypersurface. In this 
case the theorem takes the form: 


(1.1) $\int_{R_n} K\, dO = \dfrac{\omega_n}{2}\, N$ 

where K is the total curvature of $R_n$, $\omega_n$ is the area of an n-sphere (a sphere 
whose surface is n dimensional), and N is the Euler number of $R_n$. We recall 
here that K is defined for a hypersurface as the product of the n principal 
curvatures, and that it can be expressed as a polynomial in the $R_{\alpha\beta\gamma\delta}$ of $R_n$ 
divided by the determinant of the $g_{\alpha\beta}$. We shall not consider the case n odd, 
for it has been shown that (1.1) does not hold under these circumstances.¹ 
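
A small sketch may help fix the convention for $\omega_n$ used in (1.1). It assumes the standard closed form $\omega_n = 2\pi^{(n+1)/2}/\Gamma((n+1)/2)$ for the area of the unit n-sphere, which is not stated in the text, and checks (1.1) on the unit n-sphere itself, where every principal curvature is 1 and hence K = 1. The function names are hypothetical.

```python
import math

def sphere_area(n: int) -> float:
    """Area omega_n of the unit n-sphere (surface of dimension n).

    Assumed closed form: 2 * pi**((n+1)/2) / Gamma((n+1)/2).
    """
    return 2.0 * math.pi ** ((n + 1) / 2) / math.gamma((n + 1) / 2)

def euler_number_of_unit_sphere(n: int) -> float:
    """Solve (1.1) for N on the unit n-sphere, where K = 1 identically."""
    integral_of_K = sphere_area(n)          # integral of K dO = area, since K = 1
    return 2.0 * integral_of_K / sphere_area(n)
```

With n = 2 this gives $\omega_2 = 4\pi$, so the right member of (1.1) is 2πN, the two dimensional statement quoted above.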

The problem at hand is to extend (1.1) to spaces which are not 
hypersurfaces. If no imbedding is to be assumed this requires a generalization of 
the Gauss-Bonnet theorem to more than two dimensions, and so far this has 
not been accomplished. Progress, however, can be made by assuming that $R_n$ 
lies in a Euclidean space of n + q dimensions, and on this basis we shall 
prove the following 


THEOREM. If a closed Riemann manifold $R_n$ of even dimension can be 
made a subspace of a Euclidean space $E_{n+q}$, then 


$\int_{R_n} K\, dO = \dfrac{\omega_n}{2}\, N$, 

where 


* Received October 12, 1939. 
¹ For the case of a hypersurface see H. Hopf, “Über die Curvatura integra 
geschlossener Hyperflächen,” Mathematische Annalen, vol. 95 (1925), pp. 340-367. 




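$K = \dfrac{\epsilon^{\alpha_1 \cdots \alpha_n}\, \epsilon^{\beta_1 \cdots \beta_n}\, R_{\alpha_1\alpha_2\beta_1\beta_2} \cdots R_{\alpha_{n-1}\alpha_n\beta_{n-1}\beta_n}}{n!\, 2^{n/2}\, |g_{\alpha\beta}|}$. 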


The term “closed Riemann manifold” is used in the sense defined by 
Hopf in a paper in which the background of this problem is discussed and 
the present investigation suggested.² 

The chief difficulty in the preparation of this paper was the definition 
of K, since no theory of principal curvatures, etc., exists for spaces other than 
hypersurfaces. Instead K is defined indirectly by the use of the theory of 
tubes recently developed by H. Weyl.³ Once this is accomplished our theorem 
is an immediate application of Kronecker’s index theorem⁴ and of Weyl’s 
results. 


2. Kronecker’s index.⁴ An important tool in the proof is an integral 
theorem due to Kronecker, the proof of which is here summarized from the 
present point of view. Let S be an n dimensional closed Riemann manifold 
on which is defined a set of n + 1 functions of class $C^1$, $V^i(x)$, which satisfy 
$V^i V^i = 1$. By means of this set of functions we can consider a continuous 
mapping of S upon the unit n-sphere Σ, whose equation is $V^i V^i = 1$. The 
orientations on S and Σ are those imposed by a fixed orientation in the 
arithmetic space of the parameters $x^\alpha$. This mapping is of a definite degree 
d, where d is an integer, positive, negative, or zero. We seek an analytic 
expression for d. Consider the determinant: 


(2.1) $D = \begin{vmatrix} V^1 & \cdots & V^{n+1} \\ \partial V^1/\partial x^1 & \cdots & \partial V^{n+1}/\partial x^1 \\ \vdots & & \vdots \\ \partial V^1/\partial x^n & \cdots & \partial V^{n+1}/\partial x^n \end{vmatrix}$ 

It is easy to show that 

(2.2) $D^2 = \left| \dfrac{\partial V^i}{\partial x^\alpha} \dfrac{\partial V^i}{\partial x^\beta} \right|$ 

² H. Hopf, “Differentialgeometrie und Topologische Gestalt,” Jahresbericht der 
Deutscher Math. Vereinigung, vol. 41 (1932), pp. 209-229. Also see Hopf und Rinow, 
“Über den Begriff der vollständigen differentialgeometrischen Fläche,” Comm. Math. 
Helvet., vol. 3 (1931), pp. 209-225. 

³ H. Weyl, “On the volume of tubes,” American Journal of Mathematics, vol. 61 
(1939), pp. 461-472. Readers should note that in Weyl’s paper $\omega_n$ refers to the surface 
area of a sphere which incloses a volume of n dimensions. We put $\omega_n$ equal to the 
area of a sphere whose inclosed volume is n + 1 dimensional. 

⁴ For a full treatment see J. Tannery, Introduction à la Théorie des Fonctions, 
Note by J. Hadamard, vol. 2, pp. 437-477. 




which is recognized as the determinant of the metric tensor of the n-sphere 
if $V^i$ are taken as Euclidean coordinates. The area, $\omega_n$, of the sphere is thus 
given by: 


$\int \sqrt{D^2}\; dx^1 \cdots dx^n = d \cdot \omega_n$ 

integrated over S provided the sign of the radical is chosen from point to point 
to allow for overlapping of the covering. This is accomplished at once by 
using $\int_S D\, dx$. For D is numerically equal to $\sqrt{D^2}$ and has a positive sign 
for elements on the sphere of positive orientation and a negative sign in the 
opposite case. Therefore $\int_S D\, dx = d \cdot \omega_n$. 
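
The lowest dimensional instance of Kronecker's integral can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's method. It takes n = 1, S the circle with parameter x, and the hypothetical maps $V(x) = (\cos kx, \sin kx)$, for which the determinant of (2.1) reduces to $D = V^1\, dV^2/dx - V^2\, dV^1/dx$ and $\omega_1 = 2\pi$. Derivatives are approximated by central differences, so the result is only approximately an integer.

```python
import math

def kronecker_degree(V, steps: int = 20000) -> float:
    """Estimate d = (1/omega_1) * integral of D dx for a map V of the circle
    [0, 2*pi) into the unit circle, with D = V1 * V2' - V2 * V1'."""
    h = 2.0 * math.pi / steps
    total = 0.0
    for j in range(steps):
        x = (j + 0.5) * h
        v1, v2 = V(x)
        a1, a2 = V(x + 0.5 * h)
        b1, b2 = V(x - 0.5 * h)
        d1, d2 = (a1 - b1) / h, (a2 - b2) / h   # central differences for dV/dx
        total += (v1 * d2 - v2 * d1) * h        # D dx
    return total / (2.0 * math.pi)              # divide by omega_1 = 2*pi

def winding_map(k: int):
    """Hypothetical test map that wraps the circle k times around itself."""
    return lambda x: (math.cos(k * x), math.sin(k * x))
```

For `winding_map(k)` the integrand is the constant k, so the integral is 2πk and the degree is k, negative k corresponding to reversed orientation.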


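In the special case where S is a hypersurface and where $V^i$ is the normal 
vector $\xi^i$, we arrive at the total curvature of S. For 


(2.3) $\dfrac{\partial \xi^i}{\partial x^\alpha} = b_{\alpha\beta}\, g^{\beta\gamma}\, \dfrac{\partial y^i}{\partial x^\gamma}$ 

where $b_{\alpha\beta}$ are the negatives of the coefficients of the second fundamental form 
of S. From the fact that $\dfrac{\partial y^i}{\partial x^\alpha} \dfrac{\partial y^i}{\partial x^\beta} = g_{\alpha\beta}$ we have that: 

(2.4) $D^2 = |b_{\alpha\beta}|^2 / |g_{\alpha\beta}|$; 

or that 

(2.5) $D = e\, |b_{\alpha\beta}| / |g_{\alpha\beta}|^{1/2}$. 


By considering the special case 

$\partial y^i / \partial x^\alpha = (0, \dots, 0, 1, 0, \dots, 0)$ 

with 1 in the (α + 1)-th place, $b_{\alpha\beta} = \delta_{\alpha\beta}$; $g_{\alpha\beta} = \delta_{\alpha\beta}$; it is shown that e is 
definitely + 1, since (2.5) is an identity. But since K, the total curvature, 
equals $|b_{\alpha\beta}| / |g_{\alpha\beta}|$, this shows that: 


$\int_S K\, dO = d \cdot \omega_n$ 


where $dO = |g_{\alpha\beta}|^{1/2}\, dx^1 \cdots dx^n$. 
Since n is even, it is necessary that N, the Euler number of S, be equal to 2d. 
Hence 


$\int_S K\, dO = \dfrac{\omega_n}{2}\, N$ 


which is the required theorem for this special case. 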


3. Fundamental equations on tubes. Let $y^i = y^i(u)$ be the parametric 
equations of $R_n$ in $E_{n+q}$ for a neighborhood of $R_n$. Since it may be 
necessary to consider a number of sets of such parameters in order to cover 
$R_n$ completely, we shall let $u^\alpha$ be a typical set. There now exist q mutually 
orthogonal unit vectors $\xi_\sigma^i$ (σ = 1, ..., q) which are normal to $R_n$. Let 
these be chosen as functions of class C* and such that the determinant 


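$\begin{vmatrix} \partial y^i / \partial u^\alpha \\ \xi_\sigma^i \end{vmatrix} > 0$. The parametric equations of a tube of unit radius may then 
be written: 

(3.1) $x^i = y^i(u) + t^\sigma \xi_\sigma^i$ 

where 

(3.2) $t^\sigma t^\sigma = 1$ 

and $v^A$ (A = 1, ..., q − 1) are parameters on a (q − 1)-sphere. Again several 
sets of v's will be needed to cover the sphere. The tangent vectors of this 
tube are then: 


(3.3) $\dfrac{\partial x^i}{\partial u^\alpha} = \dfrac{\partial y^i}{\partial u^\alpha} + t^\sigma \dfrac{\partial \xi_\sigma^i}{\partial u^\alpha}, \qquad \dfrac{\partial x^i}{\partial v^A} = \dfrac{\partial t^\sigma}{\partial v^A}\, \xi_\sigma^i$. 


The tube is moreover a hypersurface of $E_{n+q}$ and its normal vector is then 
$t^\sigma \xi_\sigma^i$. This follows from (3.3) and the equations:⁵ 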


We then observe that: 


0 
= — 190% + ; 
(3. 4) 
0 .. 
( 
Hence for the tube the D of Kronecker’s index is 
1 row 
ar, | 
(3. 5) D= q—1 rows 
170% + tvpcsaép* | 
where 0%) —%qy. And at once it follows that 
ate 
(3. 6) D? = | |X a » 


whose square root gives: 


(3. 7) Dae 


5 For instance see L. P. Eisenhart, Riemannian Geometry, p. 189, equations (56.3). 




ate ate 
Ava |" 
sphere, 3. The value of e here depends essentially on the orientation chosen 
on R, and on the sphere. In order to decide this matter consider the special 
case where: = (0,: with 1 in the o-th place only; 
-.0,1,0,: +, 0) with 1 in the « + q-th place only; = Sag; 


We note that V/¢ is the surface element of a q—1 


where ¢ = 


74g = 0 for oA 1; = 0. Then (3.7) becomes term by term : 
| 
avi 
7 0 
n 
at ata | Viv 
0 


The value of the upper left-hand minor is + Vt. We choose the positive 
orientation on = so that the sign of the radical is positive at all times. This 
shows that e = + 1, since (3.7) is an identity. Remembering to perform 
all integrations in the thus determined positive sense, we have that 


a6 | 
Rn, = | Jap | 2 


where $\bar{N}$ is the Euler number of the tube, provided that n + q − 1 is even. 
We have assumed that n is even, and hence require q to be odd. The case q 
even will be handled presently. 


4. Final results. Here we follow closely the results of Weyl’s recent 
paper in evaluating the integral on the left. Since 


(4.1) I =f. | ap | dv 
[gael 


is an orthogonal invariant with respect to the index a, and since n is even, 
I is expressible as a polynomial in 5 = Following Weyl we 
have that: 
(4.2 k 




where 


n! | Jap 


Because of the relation: Ragys = Lay/ps — Faspy, We may write: 


| | Jap | 


(4. 3) K 


Thus: 


4 Noniq-1- 


Recalling that q is odd we have that 


9 (2a) "/? 


Wq- 1 


on 


and hence that 


Combination of (4.4) and (4.5) gives: 


N wm 


Rn 2 2 


Now we know from topology that $\bar{N} = 2N$, where $\bar{N}$ is the Euler number 
of the tube and N is the Euler number of $R_n$. For the tube is topologically 
the product of $R_n$ with a (q − 1)-sphere where q − 1 is even. This leads at 
once to the above relation between their Euler numbers. Thus we have that 
for n even and q odd: 


(4.7) $\int_{R_n} K\, dO = \dfrac{\omega_n}{2}\, N$. 
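
The topological step behind this result, that the Euler number of the tube is twice that of $R_n$ when q is odd, rests on two standard facts which the text uses without stating: the Euler number of a k-sphere is $1 + (-1)^k$, and Euler numbers multiply under topological products. A minimal sketch, with hypothetical function names:

```python
def euler_number_sphere(k: int) -> int:
    """Euler number of the k-sphere: 2 for even k, 0 for odd k."""
    return 1 + (-1) ** k

def euler_number_tube(N: int, q: int) -> int:
    """Euler number of the tube, taken as the product of R_n (Euler number N)
    with a (q - 1)-sphere, using multiplicativity under products."""
    return N * euler_number_sphere(q - 1)
```

For odd q this returns 2N, as used above. For even q it returns 0 whatever N is, which suggests why that case is handled separately, by raising the imbedding dimension by one.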


This result extends immediately to the case q even. For if q is originally 
even, imbed the n + q dimensional Euclidean space in a similar space of 
n + q + 1 dimensions so that the parametric equations of $R_n$ are $y^i = y^i(u)$, 
i = 1, ..., n + q; $y^{n+q+1}$ = constant. Now the proof proceeds as above, 
yielding the desired result, since q does not appear in the final formula whatsoever. 


HAVERFORD COLLEGE. 




THE INDEX THEOREM FOR A CALCULUS OF VARIATIONS 
PROBLEM IN WHICH THE INTEGRAND 
IS DISCONTINUOUS.* 


By NANCY COLE. 


Introduction. The purpose of this paper is to establish Morse’s Index 
Theorem¹ for a problem in euclidean m-space in which the integrand is 
discontinuous and in which the basic curve g is a broken extremal with a finite 
number of corners. We assume that at each corner g is cut across by a regular 
(m − 1)-manifold of class C*, which is not tangent to either arc of g at the 
corner, and that at each corner g satisfies a set of “primary incidence 
relations.” Our integral J along g will be of the form 


$J = \int_{g_1} F^1(x, \dot{x})\, dt + \cdots + \int_{g_k} F^k(x, \dot{x})\, dt$ 


where $g_1, \dots, g_k$ indicate the extremal arcs of which g is composed. Mason 
and Bliss (see Bliss 1) discussed the minimizing properties of a broken extremal 
in a problem with a discontinuous integrand in 2-space. Miles (1) extended 
their results to 3-space. In order to keep the notation as simple as possible, 
we treat below the case k = 2 in m-space. 


1. General hypotheses. Let R be an open region in the space of the 
variables $(x) = (x^1, \dots, x^m)$. Let g be a simple continuous curve lying in 
R and composed of two successive regular arcs $g_1$ and $g_2$, each of class $C^2$. 
We shall represent g in the form 


(1.1) $x^i = \gamma^i(t)$ 


where t is the arc length and increases from t' to t'' inclusive. 

Let c, where t' < c < t'', represent the value of the parameter t at the 
point of intersection of $g_1$ and $g_2$. We suppose that at the corner t = c, g is 
cut across by a regular (m − 1)-manifold M which is not tangent to either 
arc of g at t = c. We assume that M is representable in the form 

(1.2) $x^i = z^i(\alpha^1, \dots, \alpha^n)$, (i = 1, ..., m), 

where the functions $z^i(\alpha)$ are of class $C^2$ for (α) near (0), and for (α) = (0) 
give the point t = c on g. We term M the deflecting manifold. 


* Received June 22, 1939. 
¹ See Morse 2. Numerals following the name of an author refer to the bibliography 
at the end. 


By a function of class C* in a domain S which is not entirely open, we 
shall mean a function of class C* in an open region which contains the domain 
S in its interior. 
In the space of the variables (x) let $R_1$ ($R_2$) be a domain of points (x) 
near $g_1$ ($g_2$) excepting those points (x) not on M which lie on the same side 
of M as $g_2$ ($g_1$). Let 

$F^1(x, r) = F^1(x^1, \dots, x^m, r^1, \dots, r^m)$ 

be a function of class C* for (x) in $R_1$ and (r) any set not (0), and let 

$F^2(x, r) = F^2(x^1, \dots, x^m, r^1, \dots, r^m)$ 

be a function of class C* for (x) in $R_2$ and (r) any set not (0). We assume 


that each function $F^\kappa(x, r)$, κ = 1 or 2, is positive homogeneous of order 1 
in the variables (r); that is 


(1.3) $F^\kappa(x, kr) = k F^\kappa(x, r)$ 

for all numbers k > 0 and (r) ≠ (0). We assume also that the problem is 
positive regular along g; that is 

(1.4) $F^1_{r^i r^j}(x, r)\, \lambda^i \lambda^j > 0, \qquad F^2_{r^i r^j}(x, r)\, \lambda^i \lambda^j > 0$, (i, j = 1, ..., m) 

for $(x, r) = (\gamma, \dot\gamma)$ respectively on $g_1$ and $g_2$ and for (λ) any set not (0) 
and not proportional to $(\dot\gamma)$ on $g_1$ and $g_2$ respectively. 
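
Both hypotheses can be seen on a concrete integrand. The sketch below assumes, purely for illustration, the optical integrand $F(x, r) = n\,|r|$ with a constant index n in one region. This choice is not made in the paper. For it, the quadratic form of (1.4) has the closed form $n\,(|\lambda|^2 |r|^2 - (\lambda \cdot r)^2)/|r|^3$, which is positive unless (λ) is proportional to (r).

```python
import math

def F(n_index: float, r) -> float:
    """Illustrative integrand F(x, r) = n * |r| (index n constant in a region)."""
    return n_index * math.sqrt(sum(c * c for c in r))

def regularity_form(n_index: float, r, lam) -> float:
    """Quadratic form F_{r^i r^j} lam^i lam^j for F = n * |r|, in closed form."""
    rr = sum(c * c for c in r)
    ll = sum(c * c for c in lam)
    rl = sum(a * b for a, b in zip(r, lam))
    return n_index * (ll * rr - rl * rl) / rr ** 1.5
```

Homogeneity (1.3) is the identity `F(n, [k*c for c in r]) == k * F(n, r)` for k > 0, up to rounding. The form vanishes exactly when `lam` is a multiple of `r`, which is why (1.4) excludes sets proportional to the direction.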

A curve of class D' neighboring g will be termed admissible if it joins 
the initial point t' of g to a point (z) on M and that point (z) on M to the 
final end point t'' of g, and crosses M just once. 

For our problem the familiar integral J defined along an admissible curve 
γ of class D' neighboring g will be of the form 


$J = \int_{\gamma_1} F^1(x, \dot{x})\, dt + \int_{\gamma_2} F^2(x, \dot{x})\, dt$ 

where $\dot{x}^i$ stands for the derivative of $x^i$ with respect to the parameter t and 
where $\gamma_1$ and $\gamma_2$ denote the arcs of γ lying in $R_1$ and $R_2$ respectively. 
We assume that the Euler equations hold along g; that is the Euler 
equations 

(1.5) $\dfrac{d}{dt} F^1_{r^i} - F^1_{x^i} = 0, \qquad \dfrac{d}{dt} F^2_{r^i} - F^2_{x^i} = 0$, (i = 1, ..., m) 

hold along the arcs $g_1$ and $g_2$ respectively. A simple continuous curve which 




lies in $R_1$ or $R_2$ or both and which is composed of a finite succession of regular 
arcs of class C* along which the Euler equations hold will be termed a broken 
extremal. 

If $h^i(t)$ is any function of t defined for t near t = c on g, we shall 
represent the left and right limits of $h^i(t)$ at t = c, provided they exist, 
by $h^{i-}$ and $h^{i+}$ respectively. 

If the point t = c on g is denoted by P, by points near P on the negative 
(positive) side of M we shall mean points lying in $R_1$ ($R_2$) near P. 


2. The primary incidence relations. It is easy to prove that a 
necessary condition that g afford a weak minimum to J relative to neighboring 
admissible curves of class D' is that the directions of g at t = c satisfy the 
following conditions 


(2.1) $[F^1_{r^i}(\gamma, \dot\gamma^-) - F^2_{r^i}(\gamma, \dot\gamma^+)]\, z^i_h(0) = 0$, (h = 1, ..., n), 


where the subscript h indicates differentiation with respect to $\alpha^h$. From now 
on we shall assume that the directions of g at t = c satisfy (2.1). 
Consider the condition 


(2.2) $[F^1_{r^i}(z, r^-) - F^2_{r^i}(z, r^+)]\, dz^i = 0$, (i = 1, ..., m) 


where (z) is given by (1.2), where $(r^-)$ and $(r^+)$ denote directions near 
$(\dot\gamma^-)$ and $(\dot\gamma^+)$ respectively, where the differentials $dz^i$ are to be expressed in 
terms of the differentials $d\alpha^h$ using (1.2), and where (2.2) is to be regarded 
as an identity in these differentials for (α) near (α) = (0). 

For a point (z) on M near (α) = (0), the condition (2.2) may be 
written in the form 

(2.3)' $[F^1_{r^i}(z, r^-) - F^2_{r^i}(z, r^+)]\, z^i_h(\alpha) = 0$, (h = 1, ..., n), 

where the subscript h indicates differentiation with respect to $\alpha^h$. The 
conditions (2.3)' will be termed the primary incidence relations at a point (z) 
on the deflecting manifold M. 
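
For orientation, the primary incidence relations can be worked out in the simplest optical case. The following sketch rests on several assumptions not made in the text: that the relations have the form $[F^1_{r^i}(z, r^-) - F^2_{r^i}(z, r^+)]\, z^i_h = 0$, that m = 2, that M is the line $x^2 = 0$ with $z(\alpha) = (\alpha, 0)$, and that $F^\kappa = n_\kappa |r|$ with constant indices $n_1$, $n_2$. Under these assumptions the single relation says that $n\, r^1/|r|$ is continuous across M, which is the law of refraction, and adjoining $|r^+| = 1$ determines $(r^+)$.

```python
import math

def refract(n1: float, n2: float, r_minus):
    """Unit direction r+ across M = {x2 = 0} from the incidence relation
    n1 * r-_1 / |r-| = n2 * r+_1 together with |r+| = 1 (illustrative case)."""
    norm = math.hypot(r_minus[0], r_minus[1])
    t = n1 * r_minus[0] / (norm * n2)            # tangential component of r+
    if abs(t) > 1.0:
        raise ValueError("no real direction r+ satisfies the relation")
    # keep the sign of the normal component so that the curve crosses M
    return (t, math.copysign(math.sqrt(1.0 - t * t), r_minus[1]))

def incidence_residual(n1: float, n2: float, r_minus, r_plus) -> float:
    """Left member [F1_{r^i}(r-) - F2_{r^i}(r+)] z^i_h with z_h = (1, 0)."""
    return (n1 * r_minus[0] / math.hypot(r_minus[0], r_minus[1])
            - n2 * r_plus[0] / math.hypot(r_plus[0], r_plus[1]))
```

With an incoming direction at 30° to the normal and indices 1 and 1.5, `refract` returns a direction whose tangential component is 1/3, and `incidence_residual` vanishes up to rounding.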

A broken extremal which is composed of two successive extremal arcs 
lying in R, and R, respectively will be termed an extremaloid if its directions 
at the corner (z) on M satisfy the primary incidence relations (2.3)’. We 
note that g is an extremaloid which satisfies the primary incidence relations 
with (a) therein equal to (0). 

If to the conditions (2.3)' we adjoin the condition 

$r^{i+} r^{i+} = 1$, 

the m conditions (2.3), considered as equations in the variables $(r^+)$, have a 
unique solution $r^{i+} = r^{i+}(\alpha, r^-)$, where $r^{i+}$ are functions of class $C^2$ for (α) 
near (0) and $(r^-)$ near $(\dot\gamma^-)$. That such a solution exists follows from the 
fact that g satisfies the primary incidence relations and from the fact that for 
(α) = (0) the functional determinant of the left members of the system 
(2.3) with respect to the variables $(r^+)$ is not zero. To prove the latter fact 
we use the method of Bliss (see Bliss 2, p. 447) to show that the m-square 
functional determinant 


2,1 F | (a)=(0) 
(2. 4) 


is equal to + B- F,, where 


1 0 
0 y" 
0 Zn” 


The determinant B is not zero since the regular manifold M is not tangent to 
$g_2$ at t = c. That $F_1$ is not zero is a consequence of the positive regularity 
hypothesis (1.4). See Morse 1, p. 112. 

We state the following theorems. 


THEOREM 2.1. Given a point (z) on M near (α) = (0) and at this 
point a direction $(r^-)$ near $(\dot\gamma^-)$. There is a unique extremal on the positive 
side of M which issues from the point (z) with a direction $(r^+)$ determined 
by the primary incidence relations, and along which the parameter is the arc 
length. 


THEOREM 2.2. An n-parameter family of extremals defined for t ≤ c 
and intersecting M for t = c determines an n-parameter family of extremals 
defined for t ≥ c which with the respective extremals of the given family 
satisfy the primary incidence relations at t = c, and along which the parameter 
is the arc length. 


Such a family defined for t ≥ c is called the continuation of the given 
family defined for t ≤ c. The two families of extremals form a family of 
extremaloids. 

Similarly a family of extremals may be defined in terms of a family of 
extremals which is defined for ¢ on g. and intersects M for ¢ =c and its 
continuation family for ¢ on 9. 




3. Conjugate points. Let $t^0$ be a value of t, t' ≤ $t^0$ < c, and let $(\dot\gamma^0)$ 
denote the direction cosines of $g_1$ at the point for which t = $t^0$. Let K denote 
the unit (m − 1)-sphere with center at the origin. Let $(\beta^1, \dots, \beta^{m-1})$ be 
the parameters in a regular representation of K in the neighborhood of the 
point $(\dot\gamma^0)$, with (β) = (0) corresponding to $(\dot\gamma^0)$. The family of extremals 
issuing from $t^0$ with directions determined by (β) can be represented in the 
form 

(3.1) $x^i = \varphi^i(\tau, \beta^1, \dots, \beta^{m-1}) = \varphi^i(\tau, \beta)$, 


where τ is the arc length and (β) = (0) gives $g_1$. The functions $\varphi^i$ and $\varphi^i_\tau$ 
are of class $C^2$ for (β) near (0) and τ on $g_1$. The zeros, τ ≠ $t^0$, of the jacobian 


(3.2) $D(\tau, t^0) = \dfrac{D(\varphi^1, \dots, \varphi^m)}{D(\tau, \beta^1, \dots, \beta^{m-1})}$, (β) = (0), 

of the family (3.1) are termed the conjugate points on $g_1$ of the point $t^0$ of 
$g_1$. The order of vanishing of $D(\tau, t^0)$ at a conjugate point τ of $t^0$ is termed 
the order of that conjugate point. 
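
A jacobian of the kind in (3.2) can be evaluated numerically for a simple family. The sketch below is an illustration under assumptions not taken from the text: m = 2, the integrand is $F = |r|$, and the family consists of the straight rays $x = (\tau - t^0)(\cos\beta, \sin\beta)$ issuing from the origin. For this family the jacobian equals $\tau - t^0$, so it has no zero with τ ≠ $t^0$ and the initial point has no conjugate point.

```python
import math

def straight_family(tau: float, beta: float, t0: float = 0.0):
    """Hypothetical family of extremals for F = |r| in the plane:
    rays from the origin, with arc length tau - t0 and direction angle beta."""
    s = tau - t0
    return (s * math.cos(beta), s * math.sin(beta))

def jacobian(family, tau: float, beta: float = 0.0, h: float = 1e-6) -> float:
    """Central-difference estimate of D(phi^1, phi^2) / D(tau, beta)."""
    xa, ya = family(tau + h, beta)
    xb, yb = family(tau - h, beta)
    xc, yc = family(tau, beta + h)
    xd, yd = family(tau, beta - h)
    dx_dtau, dy_dtau = (xa - xb) / (2 * h), (ya - yb) / (2 * h)
    dx_dbeta, dy_dbeta = (xc - xd) / (2 * h), (yc - yd) / (2 * h)
    return dx_dtau * dy_dbeta - dy_dtau * dx_dbeta
```

Evaluating `jacobian(straight_family, tau)` for a few values of τ > 0 gives approximately τ, never zero.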
Since $g_1$ is not tangent to M at t = c, the equations 


$\varphi^i(\tau, \beta) = z^i(\alpha)$, (i = 1, ..., m) 


have a solution of the form 


(3.3) $\tau = \tau(\beta), \qquad \alpha^h = \alpha^h(\beta)$, (h = 1, ..., n = m − 1) 


where $\tau(\beta)$ and $\alpha^h(\beta)$ are functions of class $C^2$ for (β) near (0), and where 
$\tau(0) = c$, $\alpha^h(0) = 0$. Geometrically this means that the extremals of the 
family (3.1) intersect M for (β) near (0) in the space (x). 

In order to define the conjugate points on $g_2$ of the point $t^0$ of $g_1$ it is 
convenient to represent the family (3.1) in the form 


(3. 4) = $'(7, 8B) = y' B) m) 


where 


r(B) —P 


and 


B) = 

y'(c, B) =2'[a(B) ]. 
Such a change of parameter is admissible, and we note that (β) = (0) gives 
$g_1$ for the family (3.4) and that along $g_1$ the parameter t is the arc length. 
Moreover the jacobian 




(3.5) $D_1(t, t^0) = \dfrac{D(y^1, \dots, y^m)}{D(t, \beta^1, \dots, \beta^{m-1})}$, (β) = (0), 


of the family (3.4) vanishes if and only if the jacobian (3.2) vanishes, and 
it vanishes to the same order. Thus the conjugate points on $g_1$ of the point 
$t^0$ of $g_1$ are defined by the zeros, t ≠ $t^0$, of $D_1(t, t^0)$, and their orders, by the 
order of vanishing of $D_1(t, t^0)$ at the respective points. 

From Theorem 2.2 it follows that there exists a family of extremals on 
the positive side of M which issue from the points (z) on M near (α) = (0), 
with directions $r^{i+}$ determined in (2.3) by the directions $r^{i-}$ of the family 
(3.4), and along which the parameter is the arc length. These extremals 
represent the unique continuations of the respective extremals of the family 
(3.4), and will be represented in the form 


(3.6) $x^i = y^i(t, \beta)$, (t on $g_2$) 

where t is the arc length, where (β) = (0) gives $g_2$, and where 

$y^i(c, \beta) = z^i[\alpha(\beta)]$ (i = 1, ..., m). 


The functions $y^i$ and $y^i_t$ are of class C* for t on $g_2$ and (β) near (0). The 
zeros, t ≠ c, of the jacobian 


(3.7) $D_2(t, t^0) = \dfrac{D(y^1, \dots, y^m)}{D(t, \beta^1, \dots, \beta^{m-1})}$, (β) = (0), 

of the family (3.6) are termed the conjugate points on $g_2$ of the point $t^0$ of 
$g_1$. The order of vanishing of $D_2(t, t^0)$ at a conjugate point t of $t^0$ is the 
order of that conjugate point. 

We shall find it convenient to refer to the family of extremaloids 


(3.8) $x^i = y^i(t, \beta)$ (t on g) 


which is defined by (3.4) and (3.6) for t on $g_1$ and $g_2$ respectively. 
Conjugate point determinant. We set 
D,(t, t°) = D,(t, t°), (<tc) 


(3. 9) D,(t, t°) — dD. (t, rs, (c 


understanding that (β) = (0) therein, and term $D_g(t, t^0)$ the conjugate 
point determinant. The zeros, t ≠ $t^0$, of $D_g(t, t^0)$ define the conjugate points 
on g of the point $t^0$ of $g_1$. The conjugate points of $t^0$ and their orders are 
independent of admissible changes of parameter. 

For any point on gz the conjugate point determinant D,(t, 
is defined in a similar fashion, using the family of extremaloids issuing from 
the point ¢° with directions near the direction of g, at 1°. In so doing it is 
understood that the corner point ¢ —c is considered as a point of 4. 


Finally if $t^0$ = c, the conjugate points on g of the point t = c are the 
zeros, t ≠ c, of the conjugate point determinant $D_g(t, c)$ defined in terms 
of the family of extremaloids with a corner at the point t = c on g, with 
directions $(r^-)$ determined as in (3.1) by the parameters (β) in a regular 
representation of K in the neighborhood of the point $(\dot\gamma^-)$, (β) = (0) 
corresponding to $(\dot\gamma^-)$, and along which the parameter t is the arc length. The 
order of vanishing of $D_g(t, c)$ at a conjugate point is the order of that 
conjugate point. 
4. The second variation. Let 
== a*(0) =0, (hk +,n=m—1) 
be a set of m functions of class C? for e near 0. Let 
(4. 1) = x(t, 


be a 1-parameter family of admissible curves for which the functions 2‘(t, e) 
are of class C? for ¢ on g; and e near 0 and for ¢ on gz and e near 0 respec- 
tively, which contains g for e 0, and which satisfies the identities 


(4.2) z*(c, e) =2'[a(e) 


For each value of ¢ near 0, the integral J evaluated along the admissible curve 
determined by e is a function J(e) of class C?. We obtain a formula for the 
second variation in which we set 
(x = 1,231,j7 

where the arguments of the partial derivatives of F“ are (x,r) = (y,7) on gk. 

Since g satisfies the primary incidence relations at t = c, the second 
variation takes the form 


J"(0) + f° 9) dt + 20*( 9) dt, 
t’ 
where ° 


and where and the n constants are respectively the variations xe‘ (t, 0) 
and a,"(0) and satisfy the secondary end conditions 


(4. 4) =0, f(t”) = 0, (i= m). 
and the secondary corner conditions 

(4, 5a) ni = 2n'(0) 0", 
(4, 5b) nt = (0a. 




5. Solutions of the Jacobi equations. The Jacobi equations for t on 
$g_1$ and $g_2$ are 


(5.1) $\dfrac{d}{dt}\, \Omega^\kappa_{\dot\eta^i} - \Omega^\kappa_{\eta^i} = 0$, 


where κ = 1 and κ = 2 respectively. Throughout this section we shall assume 
that κ is fixed; that is, κ is either 1 or 2, not both. 

It is well-known that the Jacobi equations for t on $g_\kappa$ are satisfied 
identically by tangential solutions of the form 


$\rho(t)\, \dot\gamma^i(t)$, 


where ρ is an arbitrary function of t of class $C^2$ for t on $g_\kappa$. If, for t = t*, 
a solution of the Jacobi equations $\eta^i(t)$ satisfies the relation 


$\eta^i(t^*) - \rho(t^*)\, \dot\gamma^i(t^*) = 0$, 


we shall say that $\eta^i(t)$ vanishes modulo a tangential solution at t = t*. If a 
solution of the Jacobi equations for t on $g_\kappa$ is determined except for the 
possible addition of a tangential solution, we shall say it is determined modulo 
a tangential solution, or more briefly, mod T. We seek conditions by which 
solutions of the Jacobi equations for t on $g_\kappa$ are determined mod T. 

Since the determinant of the coefficients of 7‘ in (5.1) is zero, in order 
to obtain solutions of the Jacobi equations for ¢t on gx we consider the auxiliary 


differential equations 


(5.2) — — 0, (j—1,- +, m) 
de 0, 


for t on gx. Cf. Bliss 3, p. 199 and Graves 1, p. 17. To solve the m+1 


equations (5.2) we introduce the system 


— + = 0, 
(5. 3) z 


dt? = 0, 


where A is an unknown function of ¢ of class C? for ¢ on gx. Using the method 
of Morse 1, p. 124, it is easy to prove that A= 0 in solutions of (5.3). Thus 
(5. 3) may be regarded as identical with (5.2). Hence the 7 in (5.2) can 
be expressed as linear homogeneous functions of the variables (4,7) with 
coefficients which are of class C1 in ¢ for ¢ on gx. 

The most general tangential solution of the auxiliary differential equations 
(5. 2) is of the form 




$(a + bt)\, \dot\gamma^i(t)$, 


where a and b are constants. 
We shall prove the following lemma. 


LEMMA 5.1. Any solution of the Jacobi equations for t on $g_\kappa$ may be 
written as a solution of the auxiliary differential equations for t on $g_\kappa$ plus 
a tangential solution of the Jacobi equations for t on $g_\kappa$. 


Let (7) be any solution of the Jacobi equations for ¢ on gx. Consider 
the difference 


y'(t) — p(t)y*(¢) 


where p(t) is a function of ¢ of class C? for ¢ on gx. The difference y‘(t) 

is a solution of the Jacobi equations for ¢ on gx. If we choose p(t) so that 

(y’y’) = 9, (7 

dt? 

then we have 


(5. 4) 


and y‘(t) is a solution of (5.2), as was to be proved. 
We shall prove the following lemma. 


LEMMA 5.2. A solution of the Jacobi equations for t on $g_\kappa$ which 
vanishes with its derivative at a point, is identically equal to a tangential 
solution of the Jacobi equations for t on $g_\kappa$. 


Let (η) be any solution of the Jacobi equations for t on $g_\kappa$ which 
vanishes with its derivative at a point t = t*. Consider the difference 


(5.5) $w^i(t) = \eta^i(t) - \rho(t)\, \dot\gamma^i(t)$, 

where ρ(t) is a function of class $C^2$ for t on $g_\kappa$ which satisfies (5.4) and 
where $\rho(t^*) = \dot\rho(t^*) = 0$. Then $w^i(t)$ is a solution of (5.2) which vanishes 
with its derivative at t = t*. Hence $w^i(t) \equiv 0$. It follows that $\eta^i(t)$ is 
identically equal to a tangential solution of the Jacobi equations for t on $g_\kappa$. 
6. The secondary incidence relations. The secondary problem is 
non-parametric in the space of the variables $(\eta^1, \dots, \eta^m, t)$. The n-plane 

N: $\eta^i = z^i_h(0)\, w^h$, t = c 


is the analogue of the deflecting manifold M and will be referred to as the 
deflecting plane N. The deflecting plane N is regular by virtue of the fact 
that the rank of the matrix $\| z^i_h(0) \|$ is n. The only tangential solutions of 
the Jacobi equations or of the auxiliary differential equations for which 
t = c gives a point on N, are those tangential solutions which vanish at t = c. 

In the space (η, t), a solution of the Jacobi equations which is of class $C^2$ 
for t on $g_1$ and $g_2$ respectively, and which for t = c has a corner on N is the 
analogue of a broken extremal in the space (x) with fixed end points and a 
single corner on the deflecting manifold M. 

We shall next define the secondary incidence relations. To that end let 
(w) be an arbitrary set of » = m— 1 constants and e a parameter neighbor- 
ing e=0. Consider the 1-parameter family of extremaloids 


(6.1) = x'(t, €) (t= 1,---,m) 
determined by setting = ew" in the family of extremaloids (3.8). The 
family (6.1) satisfies the following identities in e: 

(6. 2) x'(c,e) =2'[a(eu) ]. 


The variations z,‘(t,0) of the family (6.1) will be denoted by 7‘((). 
Differentiating the identities (6.2) with respect to e and setting e = 0 yields 


h 
(6.3) = 2*(0) 
where 
dat 
de 


For the family (6.1) the primary incidence relations (2.3)’ reduce to 
a set of n identities in e. Upon differentiating these identities with respect 
to e and setting e = 0, we obtain 


k 


where bn is given by (4.3) and where 


Setting da*/de = w", (6.3) becomes 

(6. 6) = 

and (6.4) becomes 

(6.7) + (0) = 0. 


We term (6.7) subject to (6.6) and (6.5), the secondary incidence relations. 
It is understood that the independent variables in (6.7) are (w), (77) and (7). 

If $\eta^i(t)$ is a solution of the Jacobi equations with a corner on N for 
t = c, and if its slopes $\dot\eta^{i-}$ and $\dot\eta^{i+}$ with the set (w) determined by (6.6) 
satisfy the secondary incidence relations (6.7), then $\eta^i(t)$ will be said to 
satisfy the secondary incidence relations. 

In order to solve the secondary incidence relations for the variables (7*) 
in terms of the remaining variables, we adjoin the condition 


to the relations (6.7). The m conditions 


+ = 0, (1,7 =1,- mshi, -+,n) 
(6.8 
are called the restricted secondary incidence relations. Since the problem is 
positive regular along $g_2$, and since the regular manifold M is not tangent to 
$g_2$ at t = c, the restricted secondary incidence relations (6.8), considered as 
equations in the variables $(\dot\eta^+)$, have a unique solution $(\dot\eta^+)$ expressible in 
terms of the remaining variables. 

We have the following lemma. 


LEMMA 6.1. Corresponding to any solution of the auxiliary differential
equations for $t$ on $g_1$ which for $t = c$ intersects the deflecting plane $N$ at the
point $(w)$, there exists a unique solution of the auxiliary differential equations
for $t$ on $g_2$ which for $t = c$ gives the point $(w)$ on the deflecting plane $N$ and
at $(w)$ has a slope uniquely determined by the restricted secondary incidence
relations.


Such a solution is called the continuation for $t$ on $g_2$ of the given solution
for $t$ on $g_1$.

Returning to the problem of expressing the variables (4*) of the sec- 
ondary incidence relations (6.7) in terms of the remaining variables, we state 
the following lemma. 


Lemma 6.2. Corresponding to sets (4°) and (w), there is a set (n°) 
determined except for the possible addition of a set of the form (ky*) where 
k is a constant, by the secondary incidence relations (6.7). 


The proof of the lemma is based on the following statements. 

(a). If the variables (7) with sets (7) and (w) satisfy the secondary 
incidence relations (6.7), then the set (#* + ky"), where & is a constant, 
with the same sets (7) and (w) satisfies the secondary incidence relations 
(6.7). 

(8). Any two sets (7*) and (4°) which with sets (7°) and (w) satisfy 
the secondary incidence relations (6.7) differ by a set of the form (ky*), 
where & is a constant. 




Statements ($\alpha$) and ($\beta$) may be readily verified by direct substitution.

The following theorem is a consequence of Lemma 6.2 and the fact that
the only tangential solutions of the Jacobi equations which for $t = c$ define
a point on the deflecting plane $N$ are those which vanish at $t = c$.


THEOREM 6.1. Corresponding to any solution $(\eta)$ of the Jacobi equations
for $t$ on $g_1$ which for $t = c$ defines a point $(w)$ on the deflecting plane $N$,
there exists a solution of the Jacobi equations for $t$ on $g_2$ which for $t = c$
gives the point $(w)$ on the deflecting plane $N$ and which at $t = c$ has a slope
determined except for a constant multiple of y** by the secondary incidence
relations (6.7). This solution is unique modulo a tangential solution of the
Jacobi equations for $t$ on $g_2$ which vanishes at $t = c$.


Such a solution of the Jacobi equations is termed a continuation for $t$ on
$g_2$ of the given solution for $t$ on $g_1$.


COROLLARY 6.1. A tangential solution of the Jacobi equations for $t$ on
$g_1$ has continuations for $t$ on $g_2$ if and only if it vanishes at $t = c$. Moreover
its continuations for $t$ on $g_2$ are arbitrary tangential solutions which vanish at
$t = c$.


We state the following theorem. 


THEOREM 6.2. The continuations for $t$ on $g_2$, if they exist, of any two
mutually conjugate solutions of the Jacobi equations for $t$ on $g_1$ are mutually
conjugate in the sense of von Escherich.


Let $\eta^i$ and $\bar\eta^i$ be any two solutions of the Jacobi equations for $t$ on $g_1$
which are mutually conjugate in the sense of von Escherich; that is, suppose
the identity


(6. 9) — = 0 (i=1,--+,m) 


holds for $t$ on $g_1$. (See Bolza 1, p. 626.) We assume that the continuations
of $(\eta)$ and $(\bar\eta)$ do exist. Making use of the fact that the given solutions $(\eta)$
and $(\bar\eta)$ and their respective continuations satisfy the secondary incidence
relations, it is easy to prove that the continuations of $(\eta)$ and $(\bar\eta)$ are mutually
conjugate in the sense of von Escherich. (Cf. Morse 1, p. 52.)


7. The determinant $\Delta(t, t^0)$. In this section we shall assume, unless
otherwise specified, that a point $t^0$ on $g_1$ means a point $t^0$ such that
$t' \le t^0 < c$. Recall the conjugate point determinant defined in
(3.9). Let the $(p + 1)$-st column of that determinant be represented in the form


(7.1)    $\eta_p^i(t)$,


and let the first column be multiplied by $(t - t^0)(t - c)$ so that it will be
a tangential solution of the Jacobi equations which vanishes at $t = t^0$ and is
continuable at $t = c$. The determinant $\Delta(t, t^0)$ is defined as follows:


$\Delta(t, t^0) = |\,(t - t^0)(t - c)\,\gamma^i(t) \quad \eta_p^i(t)\,|$.


For $t \ne t^0$ and $t \ne c$, the determinant $\Delta(t, t^0)$ vanishes if and only if the
conjugate point determinant of (3.9) vanishes, and to the same order. Hence
the zeros, $t \ne t^0$ and $t \ne c$, of $\Delta(t, t^0)$ define the conjugate points on $g$ of
the point $t^0$ of $g_1$, and the order of vanishing of $\Delta(t, t^0)$ at a conjugate point
defines the order of that conjugate point. For $t = c$, $\Delta(t, t^0)$ always vanishes.
The point $c$ of $g$ will be conjugate to the point $t^0$ of $g_1$ if and only if the
order of vanishing of $\Delta(c, t^0)$ is greater than 1, and the order of $c$ as a conjugate
point of $t^0$ will be 1 less than the order of vanishing of $\Delta(c, t^0)$.

The variation $\eta_p^i(t)$ representing the $(p + 1)$-st column of $\Delta(t, t^0)$ is a
solution of the Jacobi equations determined by $g$. The variation $\eta_p^i(t)$ is,
moreover, precisely the variation $x_e^i(t, 0)$ of the family (6.1) when $w^p = 1$
and the other $n - 1$ $w$'s are null. Since the secondary incidence relations are
linear in all the variables, we have the following theorem.


THEOREM 7.1. The combination $w^p \eta_p^i(t)$ of the last $n$ columns of
$\Delta(t, t^0)$ is a solution of the Jacobi equations which satisfies the secondary
incidence relations (6.7), provided $(w)$ therein is taken as


o* = — yp 


We shall prove the following theorem. 


THEOREM 7.2. The $m$ columns of the determinant $\Delta(t, t^0)$ represent $m$
linearly independent solutions of the Jacobi equations for $t$ on $g$, and the last
$n$ columns represent solutions which are linearly independent of tangential
solutions.


That the columns of $\Delta(t, t^0)$ are linearly independent for $t$ on $g_1$ follows
from the fact that $\Delta(t, t^0)$ does not vanish identically for $t$ sufficiently near $t^0$.
Suppose first then that the last $n$ columns are linearly dependent upon a tangential
solution for $t$ on $g_1$; that is, suppose constants $(c)$, not all zero,
exist so that

(7.2)    $c_p \eta_p^i(t) = \rho(t)\,\gamma^i(t)$


where $\rho$ is a function of $t$ of class $C^2$ for $t$ on $g_1$. The function $\rho$ cannot be
identically zero for $t$ on $g_1$, for that would imply the linear dependence of
the last $n$ columns of $\Delta(t, t^0)$ for $t$ on $g_1$. But since $(c) \ne (0)$ and $\rho \not\equiv 0$,
the identity (7.2) implies that $\Delta(t, t^0)$ vanishes identically for $t$ on $g_1$. From
this contradiction we infer that the last $n$ columns of $\Delta(t, t^0)$ are linearly
independent of tangential solutions for $t$ on $g_1$.

It remains to prove that the theorem is true for $t$ on $g_2$. Suppose next
that the $m$ solutions of the Jacobi equations represented by the columns of
$\Delta(t, t^0)$ are linearly dependent for $t$ on $g_2$; that is, suppose that constants
$(d, c_1, \dots, c_n)$, not all zero, exist so that the identity

$c_p \eta_p^i(t) = d\,(t - t^0)(t - c)\,\gamma^i(t)$    $(i = 1, \dots, m;\ p = 1, \dots, n)$

holds for $t$ on $g_2$. The constants $(c)$ cannot all be null, for $c_p = 0$ for each
$p$ would imply that $d = 0$. By Theorem 7.1 the solution $c_p \eta_p^i(t)$ for $t$ on $g_2$
is a continuation of $c_p \eta_p^i(t)$ for $t$ on $g_1$. On the other hand, since $c_p \eta_p^i(t)$
for $t$ on $g_2$ is identically equal to a tangential solution which vanishes at $t = c$,
it must be a continuation of a tangential solution which vanishes at $t = c$.
Hence for $t$ on $g_1$ we have $c_p \eta_p^i(t) = \rho(t)\,\gamma^i(t)$, where $\rho(t)$ is a function
of class $C^2$ for $t$ on $g_1$ and where $\rho(c) = 0$. Now since $(c) \ne (0)$, this
implies the linear dependence of the last $n$ columns of $\Delta(t, t^0)$ on a tangential
solution for $t$ on $g_1$. From this contradiction we infer the linear independence
of the $m$ columns of $\Delta(t, t^0)$ for $t$ on $g_2$.

The proof that the last $n$ columns of $\Delta(t, t^0)$ are linearly independent
of tangential solutions is similar and will be omitted.

That the following theorem is true for $t$ on $g_1$ may be verified by substitution
in (6.9). That it is true for $t$ on $g_2$ follows from Theorem 6.2.


THEOREM 7.3. The columns of $\Delta(t, t^0)$ represent mutually conjugate
solutions of the Jacobi equations.


We shall prove the following theorem. 


THEOREM 7.4. A necessary and sufficient condition that a point $t = t^*$
on $c < t^* \le t''$ be conjugate to a point $t = t^0$ on $t' \le t^0 < c$ is that there
exist a solution of the Jacobi equations which vanishes at $t = t^0$ and $t = t^*$,
which satisfies the secondary incidence relations at $t = c$, and which is not
given by a tangential solution of the Jacobi equations.


If the point $t = t^*$ is conjugate to the point $t = t^0$, then $\Delta(t, t^0)$ vanishes
for $t = t^*$. There exists then a proper linear combination $(w)$ of the
columns of $\Delta(t^*, t^0)$ which vanishes. Let $(d, c_1, \dots, c_n)$ denote the constants
in this linear combination. First, I say that $c_p$ is not zero for each $p$,
for $(c) = (0)$ would imply $d = 0$, which is impossible. The solution $(w)$
defines a solution of the Jacobi equations which vanishes at $t = t^0$ and $t = t^*$,
which satisfies the secondary incidence relations at $t = c$, and which is not
given by a tangential solution.




Conversely let $\eta^i(t)$ be a solution of the Jacobi equations which vanishes
at $t = t^0$ and $t = t^*$, which satisfies the secondary incidence relations at
$t = c$, and which is not given by a tangential solution. Consider the difference


(7.3)    $\eta^i(t) - d\,(t - t^0)(t - c)\,\gamma^i(t) - c_p \eta_p^i(t)$


where $\eta_p^i(t)$ is given by (7.1) and $(d, c_1, \dots, c_n)$ are constants. The difference
(7.3) is a solution of the Jacobi equations which vanishes at $t = t^0$.
Moreover, by virtue of the fact that the determinant 


| 


is not zero, the constants in (7.3) may be chosen so that the derivative of
(7.3) also vanishes at $t = t^0$. Consequently for $t$ on $g_1$, we have


(7.4)    $\eta^i(t) - c_p \eta_p^i(t) = 0$,    $(c) \ne (0)$,


modulo a tangential solution which vanishes at $t = t^0$ and $t = c$. The solutions
represented by the left and right members of (7.4) must have the same
continuations for $t$ on $g_2$. A continuation of the left member of (7.4) is
$\eta^i(t) - c_p \eta_p^i(t)$, and all the continuations of the right member are tangential.
For $t$ on $g_2$, then, we have $\eta^i(t) - c_p \eta_p^i(t) = 0$, modulo a tangential
solution which vanishes at $t = c$.

Since $\eta^i(t^*) = 0$, it follows that for some constant $k$, $c_p \eta_p^i(t^*) - k\,\gamma^i(t^*) = 0$.
But $(c) \ne (0)$, so that $\Delta(t, t^0)$ must vanish at $t = t^*$,
and the point $t = t^*$ is then conjugate to the point $t = t^0$. The proof of
Theorem 7.4 is complete.

A necessary and sufficient condition that a point $t = t^*$ on $t^0 < t^* \le c$
be conjugate to a point $t = t^0$ on $t' \le t^0 < c$ is that there exist a solution of
the Jacobi equations which vanishes at $t = t^0$ and $t = t^*$, which satisfies the
secondary incidence relations at $t = c$, and which is not given by a tangential
solution of the Jacobi equations.

Consider the $m$-square determinant


$\theta(t) = |\,(t - c)\,\gamma^i(t) \quad w_p^i(t)\,|$


in which the last $n$ columns are solutions of the Jacobi equations which vanish
at the point $t = t^0$ on $g_1$, which satisfy the secondary incidence relations at
$t = c$, and which are linearly independent of tangential solutions.


LEMMA 7.1. The order of vanishing of $\theta(t)$ at any point $t = b$ on $g$
is equal to the nullity $\nu$ of $\theta(b)$.
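
The relation between order of vanishing and nullity can be checked numerically in a much simpler classical setting. The sketch below is an illustration only and is not taken from the paper: it assumes the two uncoupled equations $y_1'' + y_1 = 0$ and $y_2'' + 4y_2 = 0$ (a nonparametric analogue with no deflection and no tangential solutions), takes the solutions vanishing at $t = 0$ as the columns of a determinant, and compares the estimated order of vanishing with the nullity at a few points. All function names are hypothetical.

```python
import math

def solution_matrix(t):
    """Columns are the solutions vanishing at t = 0 of the assumed model system
    y1'' + y1 = 0, y2'' + 4*y2 = 0 (an illustrative stand-in for theta(t))."""
    return [[math.sin(t), 0.0],
            [0.0, math.sin(2.0 * t) / 2.0]]

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def nullity2(m, tol=1e-9):
    """Nullity (2 minus rank) of a 2x2 matrix, up to a small tolerance."""
    if abs(det2(m)) > tol:
        return 0
    if any(abs(x) > tol for row in m for x in row):
        return 1
    return 2

def vanishing_order(b, h=1e-3):
    """Estimate the order of vanishing of det(solution_matrix(t)) at t = b
    from the ratio of its values at b + h and b + h/2."""
    ratio = det2(solution_matrix(b + h)) / det2(solution_matrix(b + h / 2))
    return round(math.log2(abs(ratio)))

# Compare the two numbers at an ordinary point and at two zeros of the determinant.
for b in (1.0, math.pi / 2, math.pi):
    print(b, vanishing_order(b), nullity2(solution_matrix(b)))
```

In this model both numbers agree at each sample point, which is the behavior the lemma asserts for $\theta(t)$ under its own hypotheses.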


The proof of this lemma follows the method of Morse in (2), but slight 




modifications are necessary since we are using solutions of the Jacobi equations 
in place of solutions of his “ restricted Jacobi equations.” 

Let $b$ be any value of $t$ on $g$ except $t^0$ and $c$. Let $r$ be the rank of $\theta(b)$.
Then $r > 0$, and $\nu = m - r$. We suppose that the last $n$ columns of $\theta(b)$
have been reordered so that the rank of the first $r$ columns is $r$. We also
suppose that the rank of the last $\nu$ columns is zero; for if it were not zero,
it could be made zero by adding suitably chosen linear combinations of the
first $r$ columns to the remaining columns. Understanding that this has been
done, let


$u_h^i(t)$,    $v_k^i(t)$
represent the first $r$ and last $\nu$ columns respectively of $\theta(t)$.


Applying the integral form of the law of the mean to the elements in
the last $\nu$ columns of $\theta(t)$ yields


(7.5)    $\theta(t) = (t - b)^{\nu} B(t)$


where the function $B(t)$ is continuous in $t$ for $t$ on $g_1$ and for $t$ on $g_2$
respectively, and where
$B(b) = |\,u_h^i(b) \quad v_k'^i(b)\,|$.


The lemma will follow from (7.5) if $B(b) \ne 0$.

Suppose that $B(b) = 0$. There will exist then a proper linear combination
$(w)$ of the columns of $B(b)$ with coefficients $(c_1, \dots, c_r, -d_1, \dots, -d_\nu)$
such that $(w) = (0)$. Moreover the constants $d_k$ cannot all be zero. For
$d_k = 0$ for each $k$ would imply that $c_h u_h^i(b) = 0$ for each $i$, and the rank of
$\theta(b)$ would be less than $r$. We set

$u^i(t) = c_h u_h^i(t)$,    $v^i(t) = d_k v_k^i(t)$.

Hence
(7.6)    $u^i(b) = v'^i(b)$,    $v^i(b) = 0$.
We note that $u^i(b)$ cannot be zero for each $i$. For that would imply
$v^i(b) = v'^i(b) = 0$ for each $i$, and hence that $(v)$ is a tangential solution
which vanishes with its derivative at $t = b$. That this is impossible follows
from the hypothesis that the last $n$ columns of $\theta(t)$ are linearly independent
of tangential solutions.

Making use of (7.6) and of the fact that $u^i(t)$ and $v^i(t)$ are mutually
conjugate for $t$ on $g$, we obtain


y(b) ]o*(b) = 0, 
where $\kappa$ is 1 or 2 according as $t = b$ lies on $g_1$ or $g_2$. It follows from (1.4) that




But (7.6) and (7.7) imply that $(v)$ is identically equal to a tangential
solution which vanishes at $t = b$. This is impossible since the last $n$ columns
of $\theta(t)$ are linearly independent of tangential solutions. We conclude that
$B(b)$ is not zero, and the lemma follows from (7.5) when $b \ne t^0$ and $b \ne c$.

The lemma is true when $b = t^0$ or $b = c$, but the proof is slightly
different.

Since the determinant $\Delta(t, t^0)$ satisfies the conditions imposed on $\theta(t)$
in Lemma 7.1, we have the following theorem.


THEOREM 7.5. The conjugate points on $g$ of a point $t^0$ of $g_1$ are isolated
and possess orders equal to the nullity $\nu$ of $\Delta(t, t^0)$ at the respective zeros
$t \ne t^0$ and $t \ne c$ of $\Delta(t, t^0)$. The order of $t = c$ as a conjugate point of
$t = t^0$ is $\nu - 1$.


Since any solution of the Jacobi equations which vanishes at $t = t^0$ and
satisfies the secondary incidence relations at $t = c$ can be written as a linear
combination of the last $n$ columns of $\Delta(t, t^0)$ and a tangential solution which
vanishes at $t = t^0$ and $t = c$, we have the following theorem.


THEOREM 7.6. If a point $t = t^*$ on $c < t^* \le t''$ is conjugate to a point
$t = t^0$ on $t' \le t^0 < c$, the maximum number of solutions of the Jacobi equations
which vanish at $t = t^0$ and $t = t^*$, which satisfy the secondary incidence
relations at $t = c$, and which are linearly independent of tangential solutions,
equals the order of the conjugate point.


If a point $t = t^*$ on $t^0 < t^* \le c$ is conjugate to a point $t = t^0$ on
$t' \le t^0 < c$, the maximum number of solutions of the Jacobi equations which
vanish at $t = t^0$ and $t = t^*$, which satisfy the secondary incidence relations
at $t = c$, and which are linearly independent of tangential solutions, equals
the order of the conjugate point.

Throughout this section we have assumed that $t^0 \ne c$ is a point of $g_1$.
Corresponding theorems hold if $t^0$ is a point on $g_2$ for which $c < t^0 \le t''$.

If $t^0 = c$, we define $\Delta(t, c)$ as the determinant obtained by multiplying
the first column of the conjugate point determinant of (3.9), taken with $t^0 = c$,
by $(t - c)$. For $t \ne c$, $\Delta(t, c)$ vanishes if and only if that conjugate point
determinant vanishes, and to the same order. The
$m$ columns of $\Delta(t, c)$ represent $m$ linearly independent, mutually conjugate
solutions of the Jacobi equations which vanish at $t = c$ and satisfy the secondary
incidence relations with $(w) = (0)$ therein. Moreover the last $n$
columns represent solutions which are linearly independent of tangential
solutions of the Jacobi equations. A necessary and sufficient condition that
a point $t = t^*$ on $g$ be conjugate to the point $t = c$ is that there exist a solution
of the Jacobi equations which vanishes at $t = c$ and $t = t^*$, which satisfies
the secondary incidence relations at $t = c$, and which is not given by a tangential
solution of the Jacobi equations. Furthermore the conjugate points
on $g$ of the point $t = c$ are isolated and possess orders equal to the nullity of
$\Delta(t, c)$ at the respective zeros $t \ne c$ of $\Delta(t, c)$. The order of vanishing of
$\Delta(c, c)$ is equal to the nullity $m$ of $\Delta(c, c)$. Finally, if a point $t = t^*$ of $g$
is conjugate to the point $t = c$, the maximum number of solutions of the
Jacobi equations which vanish at $t = c$ and $t = t^*$, which satisfy the secondary
incidence relations at $t = c$, and which are linearly independent of
tangential solutions, equals the order of the conjugate point.


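8. The index theorem. Let $a_0, a_1, \dots, a_{\lambda+1}$
be a set of values of $t$ such that

(8.1)    $t' = a_0 < a_1 < \dots < a_\lambda < a_{\lambda+1} = t''$

and such that no one of the $\lambda + 1$ segments into which $g$ is divided by the
points of (8.1) contains a conjugate point of its initial end point.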

Let $M_\sigma$ be a regular $(m - 1)$-manifold of class $C^2$ which intersects $g$ at
the point $t = a_\sigma$, but which is not tangent to $g$ at that point. We suppose
that $M_\sigma$ is regularly represented neighboring the point $t = a_\sigma$ in the form


= Bo") o not summed) 


and that $(\beta_\sigma) = (0)$ determines the point $a_\sigma$ on $g$. Set $a_\mu = c$, and let
$M_\mu$ be an alternative notation for the deflecting manifold $M$. The manifolds
$M_\sigma$, $\sigma = 1, \dots, \lambda$, are termed a set of intermediate manifolds.

Let the points $t = a_0$ and $t = a_{\lambda+1}$ on $g$ be denoted by $A$ and $B$ respectively.
Let points $P_q$ on the respective intermediate manifolds $M_q$ be chosen
so near to $g$ that the successive points


(8.2)    $A, P_1, \dots, P_\lambda, B$


can be joined by extremal segments.
Let $(v)$ be a set of $\lambda n$ variables, the $q$-th $n$ of which are the parameters
of the point $P_q$ on $M_q$; that is,


For (v) sufficiently near (0), the points (8.2) are completely determined by 
the set (v) and the points A and B. The broken extremal E(v) joining the 
points (8.2) will be expressed in the form 


(8.3)    $x^i = X^i(t, v)$,


where $X^i$ and $X_t^i$ are functions of class $C^2$ in their arguments for $(v)$ near
$(0)$ and $t$ on each of the $\lambda + 1$ components of $E(v)$, and where $(v) = (0)$
gives $g$. We assume that the parameter $t$ has been chosen so that $t = t'$
and $t = t''$ give the points $A$ and $B$ respectively, so that the values $t = a_q$
$(q = 1, \dots, \lambda)$ give the respective points $P_q$ on $M_q$, and so that between
each two successive vertices (8.2) the rate of change of $t$ with respect to the
arc length is constant.

The integral $J$ considered along $E(v)$ is a function of class $C^2$ for $(v)$
sufficiently near $(0)$, and will be denoted by $J(v)$.


THEOREM 8.1. The function J(v) has a critical point for (v) = (0). 


To prove this theorem we consider the first partial derivatives of $J(v)$
with respect to the variables $v^r$, $r = 1, \dots, \lambda n$. Integrating by parts, setting
$(v) = (0)$, and making use of the fact that $g$ satisfies the primary incidence
relations at $t = c$, we find that the first partial derivatives of $J(v)$ with
respect to the variables $v^r$ all vanish for $(v) = (0)$.

We set

$Q(v) = J^0_{v^r v^s}\, v^r v^s$    $(r, s = 1, \dots, \lambda n)$


where the superscript 0 indicates evaluation for $(v) = (0)$, and term $Q(v)$
the index form associated with the extremaloid $g$. If $\rho$ is the rank of $Q(v)$,
the number $\lambda n - \rho$ is termed the nullity of $Q(v)$, and will be denoted by
$N(Q)$. The index of $Q(v)$ is the number of negative characteristic roots
belonging to $|J^0_{v^r v^s}|$, and will be denoted by $I(Q)$.
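
For a concrete quadratic form, the index and nullity just defined can be computed by reducing the symmetric matrix of the form to diagonal shape by congruence; by Sylvester's law of inertia the numbers of negative and zero diagonal entries do not depend on the reduction chosen. The following is a minimal sketch under the assumption that the matrix has integer or rational entries. The function name `index_and_nullity` and the sample matrices are illustrative and do not come from the paper.

```python
from fractions import Fraction

def index_and_nullity(matrix):
    """Return (index, nullity) of a symmetric matrix with rational entries,
    i.e. the numbers of negative and of zero characteristic roots, found by
    symmetric (congruence) elimination in exact arithmetic."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    negative = zero = 0
    for k in range(n):
        # Look for a nonzero diagonal pivot in the remaining block.
        p = next((i for i in range(k, n) if a[i][i] != 0), None)
        if p is None:
            # All remaining diagonal entries vanish: use an off-diagonal entry.
            pair = next(((i, j) for i in range(k, n)
                         for j in range(i + 1, n) if a[i][j] != 0), None)
            if pair is None:
                zero += n - k          # the remaining block is identically zero
                break
            i, j = pair
            for c in range(n):         # add row j to row i
                a[i][c] += a[j][c]
            for r in range(n):         # add column j to column i
                a[r][i] += a[r][j]
            p = i                      # now a[i][i] equals twice the old a[i][j]
        # Bring the pivot to position k by a symmetric interchange.
        a[k], a[p] = a[p], a[k]
        for r in range(n):
            a[r][k], a[r][p] = a[r][p], a[r][k]
        d = a[k][k]
        if d < 0:
            negative += 1
        # Clear row k and column k outside the diagonal.
        for i in range(k + 1, n):
            f = a[i][k] / d
            if f != 0:
                for c in range(n):
                    a[i][c] -= f * a[k][c]
                for r in range(n):
                    a[r][i] -= f * a[r][k]
    return negative, zero

print(index_and_nullity([[0, 1], [1, 0]]))
print(index_and_nullity([[-2, 0, 0], [0, 0, 0], [0, 0, 3]]))
```

Exact rational arithmetic is used here so that a zero characteristic root is detected as exactly zero; with floating-point entries a tolerance would be needed instead.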

In order to obtain a representation of the index form $Q(v)$ in terms of
the second variation we set up the family of broken extremals $E$ determined
by the points $A$ and $B$ and the set $(ev^1, \dots, ev^{\lambda n})$, where $(v)$ is held fast
and $e$ is a variable near 0. The family of broken extremals $E$ will be
represented in the form

(8.4)    $x^i = x^i(t, e)$    $(i = 1, \dots, m)$


where the functions $x^i(t, e)$ are defined by referring to (8.3) and setting
$X^i(t, ev) = x^i(t, e)$. The family (8.4) has the property that for $e$ near 0,


$x^i(c, e) = z^i(e\alpha^1, \dots, e\alpha^n)$,


($\sigma$ not summed).


Before proceeding with the problem of the second variation for the family 
(8.4) it will be convenient to present several definitions and prove a lemma. 
Let $\eta^i(t)$ be a broken solution of the Jacobi equations which is defined
and continuous for $t$ on $g$, and which has corners only at the points $t = c$
and $t = a_\sigma$. The broken solution $\eta^i(t)$
will be termed admissible if it satisfies the end conditions
$\eta^i(t') = 0$,    $\eta^i(t'') = 0$    $(i = 1, \dots, m)$,


and if with a set (v) it satisfies the conditions 


(8. 5)’ ni(c) = a", (h=1,---,n=—m—1) 
(8. 5)” ni (do) = (0) Bo", (o not summed) 


at the corners $t = c$ and $t = a_\sigma$ respectively. If the broken solution $\eta^i(t)$ is
tangential, a necessary and sufficient condition that it be admissible is that it
vanish at the points at which $t$ takes on the values

(8.6)    $a_0, a_1, \dots, a_{\lambda+1}$.


Any two admissible broken solutions of the Jacobi equations will be said 
to be equal mod T°, if their difference is an admissible broken tangential 
solution. Understanding that a solution determined mod T° means a solution 
which is unique except for the possible addition of an admissible broken 


tangential solution, Lemma 8. 1 is as follows. 


LEMMA 8.1. An admissible broken solution of the Jacobi equations
determines a unique set (v), and is determined mod T° by a set (v).


The first part of the lemma follows from the fact that each of the intermediate
manifolds $M_\sigma$ is regular. To prove the second part we note that the
variation $\eta^i(t) = x_e^i(t, 0)$ of the family (8.4) is an admissible broken
solution corresponding to a given set $(v)$. Let $\bar\eta^i(t)$ be any other admissible
broken solution corresponding to the same set $(v)$. Then the difference
$w^i(t) = \eta^i(t) - \bar\eta^i(t)$ is a broken solution of the Jacobi equations which
vanishes at the points at which $t$ takes on the values (8.6). Since no one of
the $\lambda + 1$ segments

(8.7)    $a_j \le t \le a_{j+1}$    $(j = 0, 1, \dots, \lambda)$


of $g$ contains a conjugate point of its initial end point, $w^i(t)$ is an admissible
broken tangential solution. Hence $\eta^i(t)$ and $\bar\eta^i(t)$ are equal mod T°, and
the proof of the lemma is complete.

Returning to the function $J(ev)$, differentiating twice with respect to $e$,
integrating by parts, and setting $e = 0$, we find that
(8. 8) J = *] + = 0°, + 


(i—1,---,m; 




where $\eta^i(t)$ is the variation $x_e^i(t, 0)$ of the family (8.4) and where $\kappa$ is 1
for $a_\sigma$ on $g_1$ and 2 for $a_\sigma$ on $g_2$. The terms corresponding to the limits $t'$
and $t''$ vanish in (8.8) as do the terms corresponding to the points $a_\sigma$. Hence
we have the following theorem, readily proved with the aid of Lemma 8.1.


THEOREM 8.2. The index form Q(v) admits the representation 


c 
Q(v) + 4) dt + f 20? (x, ip) dt 
t’ e 
(h,k 


where $\eta^i(t)$ is any admissible broken solution of the Jacobi equations
determined mod T° by $(v)$, and where


On = y) — (9) (t= »m). 


An admissible broken solution of the Jacobi equations $\eta^i(t)$ will be
termed a special solution if at $t = c$ it satisfies the secondary incidence relations
with $w^h = \alpha^h$ therein and if corresponding to each corner $t = a_\sigma$ there
is a constant $k$ such that


Ant = = by‘ (a0) 
We shall prove the following lemma. 


LEMMA 8.2. A necessary and sufficient condition that a set $(v) \ne (0)$
be a critical point of $Q(v)$ is that $(v)$ determine mod T° a special solution of
the Jacobi equations.


A necessary and sufficient condition that $(v) \ne (0)$ be a critical point
of $Q(v)$ is that
(8.9)    $\partial Q / \partial v^r = 0$    $(r = 1, \dots, \lambda n)$.


We shall prove first that the conditions (8.9) imply that any admissible
broken solution $(\eta)$ determined mod T° by $(v)$ is a special solution. For $\sigma$
fixed, the $n$ members of (8.9) representing the partial derivatives of $Q(v)$
with respect to $(\beta_\sigma^1, \dots, \beta_\sigma^n)$ may be written in the form
(8. 10)’ = 0, 
where 

Ati = 


with $\kappa = 1$ or 2, according as $a_\sigma$ is a point on $g_1$ or $g_2$ respectively. Moreover


(8. 10)” Atiyi (ac) = 0. 




The m equations (8.10) have a unique solution A{;—0. But from the 
definition of Ag;, we see that A¢; —0 implies that 


(8. 11) Ay! = ky* (ac) 
for o fixed. 

The $n$ members of (8.9) representing the partial derivatives of $Q(v)$
with respect to $(\alpha^1, \dots, \alpha^n)$ may be written in the form


(8.12) + = 0 (i= -,n). 


Thus $(\eta)$ satisfies the secondary incidence relations at $t = c$ with $w^h = \alpha^h$
therein, and the condition of the lemma is necessary for a critical point.

Conversely if $(\eta)$ is a special solution determined mod T° by $(v)$, (8.12)
holds for $t = c$, and (8.11) holds for each $\sigma$. It follows that (8.10) holds
for each $\sigma$. Hence (8.9) holds, and the condition of the lemma is sufficient
for a critical point.

Understanding that solutions which are linearly independent mod T° are 
solutions which are linearly independent of admissible broken tangential 
solutions, the following lemma is readily proved. 


LEMMA 8.3. A set of admissible broken solutions of the Jacobi equations
are linearly independent mod T° or not, according as the sets (v) are
linearly independent or not, and conversely.


From Lemmas 8.2 and 8.3 we infer that the nullity of the index form 
is equal to the number of special solutions which are linearly independent 
mod T°. 

A solution of the Jacobi equations for $t$ on $g$ which is of class $C^1$ for $t$ on
$g_1$ and $g_2$ respectively and of class $C^2$ on each interval (8.7), which vanishes
at $t = t'$ and $t = t''$, and which satisfies the secondary incidence relations at
$t = c$ will be termed a reflected solution. A reflected solution is admissible
if it satisfies conditions of the form (8.5)'' at each point $t = a_\sigma$. An admissible
reflected solution is a special solution which has no corners at the points
$t = a_\sigma$. We shall prove the following lemma.


Lemma 8.4. Any special solution of the Jacobi equations is identically 
equal mod T° to an admissible reflected solution. 


Let $\eta^i(t)$ be any special solution. Let $\rho(t)$ be a continuous function for
$t$ on $g$ with the following properties. On each interval (8.7), $\rho(t)$ is of class
$C^2$ with $\rho(a_j) = \rho(a_{j+1}) = 0$, and with $\rho'(a_j)$ and $\rho'(a_{j+1})$ so chosen that for
$t$ on $g$, the solution


$\bar\eta^i(t) = \eta^i(t) - \rho(t)\,\gamma^i(t)$


has no corners at the points $t = a_\sigma$.
Then $\bar\eta^i(t)$ is an admissible reflected solution, and the proof of the lemma is
complete.

With $(\eta)$ and $(\bar\eta)$ defined as in the proof of Lemma 8.4, we see that
$(\bar\eta)$ is identically equal to an admissible tangential reflected solution if and
only if $(\eta) = (0)$, mod T°. Moreover, if a finite set $S$ of special solutions
be replaced by a set A of admissible reflected solutions, equal mod T° respec- 
tively to the special solutions of S, then the members of A are linearly 
independent of admissible tangential reflected solutions if and only if the 
members of S are linearly independent mod T°. 

We shall prove the following theorem. 


THEOREM 8.3. The nullity of the index form $Q(v)$ equals the order of
$t''$ as a conjugate point of $t'$.


If the nullity of $Q(v)$ is $\nu$, there are $\nu$ special solutions which are linearly
independent mod T°, and therefore $\nu$ admissible reflected solutions which are
linearly independent of admissible tangential reflected solutions. It follows
from Theorem 7.6 that $t''$ is a conjugate point of $t'$ of order $\nu$.

On the other hand, if $t''$ is a conjugate point of $t'$ of order $\nu$, there are $\nu$
reflected solutions

$\eta_k^i(t)$    $(k = 1, \dots, \nu)$


which are linearly independent of tangential reflected solutions. The solutions
$\eta_k^i(t)$ are not necessarily admissible in that they may not satisfy conditions of
the form (8.5)'' at the points $t = a_\sigma$ $(\sigma = 1, \dots, \mu - 1, \mu + 1, \dots, \lambda)$.

But any reflected solution can be made admissible by adding a suitably 
chosen tangential reflected solution. Moreover, if a finite set R of reflected 
solutions be replaced by a set A of admissible reflected solutions which are 
equal, modulo a tangential reflected solution, respectively to the members of 
R, then the members of A are linearly independent of admissible tangential 
reflected solutions if and only if the members of R are linearly independent
of tangential reflected solutions. 


We assume then, that the $\nu$ reflected solutions $\eta_k^i(t)$ which are linearly
independent of tangential reflected solutions have been replaced by $\nu$ admissible
reflected solutions $\bar\eta_k^i(t)$ which are equal, modulo a tangential reflected
solution, respectively to the solutions $\eta_k^i(t)$. It follows that the solutions
$\bar\eta_k^i(t)$ are linearly independent of admissible tangential reflected solutions.
But the $\nu$ admissible reflected solutions $\bar\eta_k^i(t)$ are special solutions which are
of class $C^1$ for $t$ on $g_1$ and $g_2$ respectively. Hence the nullity of $Q(v)$ is $\nu$.

We continue with the following theorem. 




THEOREM 8.4. The index of $Q(v)$ equals the sum of the orders of the
conjugate points of $t'$ on $t' < t < t''$.


To prove Theorem 8.4 we use the method of Morse (2, Th. 4.2). That
is, we replace the extremaloid $g$ by the subarc $g_b$ on which $t' \le t \le b$, where
$t' < b \le t''$. If $t' < b \le c$, then $g_b$ is an extremal, whereas if $c < b \le t''$,
$g_b$ is an extremaloid. On $g_b$ we introduce $\lambda$ intermediate manifolds as previously
with

(8.13)    $a_0 = t'$,    $a_{\lambda+1} = b$,

and with the points $t = a_q$ $(q = 1, \dots, \lambda)$ so distributed on $g_b$ that no one
of the $\lambda + 1$ segments into which $g_b$ is thereby divided contains a conjugate
point of its initial end point. For $b > c$, the point $t = c$ must be taken as
one of the admissible points $t = a_q$, and the deflecting manifold used as the
corresponding intermediate manifold.

The family of broken extremals which hereby replaces the family $X^i(t, v)$
is denoted by $X^i(t, v, b)$. The functions $J^b(v)$ and $Q^b(v)$ are defined for $g_b$
as were $J(v)$ and $Q(v)$ for $g$.

The proof of Theorem 8.4 will be based on the following statement: 


($\alpha$). For any point $b$ the index of $Q^b(v)$ is equal to the sum of the
orders of the conjugate points of $t'$ on $t' < t < b$.


Morse proved that statement ($\alpha$) is true for $b$ on $g_1$; that is, for
$t' < b \le c$. It remains to prove that ($\alpha$) is true for $c < b \le t''$.

As $b$ increases, the index of $Q^b(v)$ [written $I(Q^b)$] will change at most
when $b$ passes through a conjugate point $a$ of $t'$ and will then increase by at
most the order $\nu$ of $a$ as a conjugate point of $t'$. Hence for each value of
$b > c$, we have
(8.14)    $I(Q^b) \le \sum \nu$    $(t' < a < b)$.


We shall prove the following lemma. 


LEMMA 8.5. For $b > c$ and nearer to $c$ than any conjugate point of $t'$,
excepting possibly $c$ itself,
(8.15)    $I(Q^b) \ge \sum \nu$    $(t' < a < b)$.


With any admissible set (8.13) for which $b = a_{\lambda+1}$ satisfies the hypothesis
of the lemma and $a_\lambda = c$, we proceed as follows: We denote $a_{\lambda+1}$ by $a_{\lambda+2}$ or $b$,
and $a_\lambda$ by $a_{\lambda+1}$ or $c$, and insert a new point $a_\lambda$ between $a_{\lambda-1}$ and $c$. We introduce
a new intermediate manifold $M_\lambda$ cutting $g_b$ at $a_\lambda$ but not tangent to $g_b$
at $a_\lambda$. For this construction we replace the set of parameters $(v)$ by a set of
$(\lambda + 1)n$ parameters $(\xi)$, the first $(\lambda - 1)n$ and the last $n$ of which form
the set $(v)$. The broken extremal $E(\xi)$ determined by $(\xi)$ and the end points
of $g_b$ coincides with the broken extremal $X^i(t, v, b)$ for $t' \le t \le a_{\lambda-1}$ and
$c \le t \le b$.

Understanding that J.”(é) and @.(é) denote the functions replacing 
J*(v) and Q°(v), we shall first prove that 


(8. 16) 21(Q.”). 
To that end let e be a parameter near 0 and set 
o(e) —J*(ev). 


When the first $(\lambda - 1)n$ and the last $n$ components of $(\xi)$ are given by $(v)$,
the inequality $\varphi(e) \ge 0$ holds for $e$ sufficiently near 0, and $\varphi(e)$ has a minimum
for $e = 0$. Hence $\varphi''(0) \ge 0$, and (8.16) follows with the aid of
Lemma 3.1a of Morse 2.

Next we set the last n variables of (€) equal to zero in Q.?(€), and obtain 
thereby a quadratic form Q.°. Applying Lemma 3. 2b of Morse 2 we see that 


(8. 17) =1(Qo°) + N(Qo°), 


where $I(Q_0^b)$ and $N(Q_0^b)$ denote the index and nullity respectively of $Q_0^b$.
But since the nullity of $Q_0^b$ is equal to the order of $c$ as a conjugate point of
$t'$ and since the index of $Q_0^b$ is equal to the sum of the orders of the conjugate
points of $t'$ on $t' < t < c$, the inequality (8.15) follows from (8.16) and
(8.17), and the proof of the lemma is complete.

Upon comparing the inequalities (8.14) and (8.15), we see that for
$b > c$, and sufficiently near $c$, statement ($\alpha$) is true.

That statement ($\alpha$) holds for any point $b > c$ on $g_2$ can be proved by
taking an arbitrary point $t^0$, where $c < t^0 \le t''$, and showing (1) that if ($\alpha$)
is true for $b < t^0$, then it is true for $b = t^0$ and (2) that if ($\alpha$) is true for
$b = t^0$, then it is true for $b > t^0$. The method of proof is similar, and the
details will be omitted. See Morse 2, Lemmas C and D.

The index and nullity of Q(v) are independent of the number, position 
and representation of the intermediate manifolds, provided they are admissi- 
bly distributed and represented, and one intermediate manifold coincides with 
the deflecting manifold. The index and nullity of Q(v) depend only on the
conjugate points of t' on g and their orders.

Recall that the nullity of a critical point of a function of a finite number
of variables is defined as the nullity of the Hessian of the function
at the critical point, and that the index of a critical point is defined as the
number of negative characteristic roots of the Hessian of the function at the
critical point. Understanding that each conjugate point is to be counted a
number of times equal to its order, we summarize our results in the Index
Theorem.

INDEX THEOREM. The point (v) = (0) is a critical point of J(v) with
an index equal to the number of conjugate points of t = t' on g preceding
t = t'', and a nullity equal to the order of t = t'' as a conjugate point of t = t'.
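
The definitions of index and nullity recalled above can be illustrated on a small symmetric matrix. The following is a minimal sketch, not part of the original paper: the function name `index_and_nullity` is hypothetical, and the method (diagonalization by congruence, justified by Sylvester's law of inertia) is a choice made here for illustration. It counts the negative and the zero diagonal entries of a congruent diagonal form, which agree with the numbers of negative and zero characteristic roots.

```python
from fractions import Fraction


def index_and_nullity(hessian):
    """Return (index, nullity) of a real symmetric matrix given as nested lists.

    The matrix is reduced to diagonal form by congruence (the same elementary
    operation applied to rows and to columns), using exact rational arithmetic.
    """
    n = len(hessian)
    a = [[Fraction(x) for x in row] for row in hessian]

    def add_multiple(i, j, f):
        # Congruence step: row_i += f * row_j, then col_i += f * col_j.
        for c in range(n):
            a[i][c] += f * a[j][c]
        for r in range(n):
            a[r][i] += f * a[r][j]

    def swap(i, j):
        # Swap rows i, j and then columns i, j (also a congruence).
        a[i], a[j] = a[j], a[i]
        for row in a:
            row[i], row[j] = row[j], row[i]

    for k in range(n):
        # Prefer a nonzero diagonal entry in the remaining block as pivot.
        pivot = next((j for j in range(k, n) if a[j][j] != 0), None)
        if pivot is None:
            # All remaining diagonal entries vanish: create one from an
            # off-diagonal entry, or stop if the remaining block is zero.
            pair = next(((i, j) for i in range(k, n)
                         for j in range(i + 1, n) if a[i][j] != 0), None)
            if pair is None:
                break
            add_multiple(pair[0], pair[1], Fraction(1))
            pivot = pair[0]
        swap(k, pivot)
        for i in range(k + 1, n):
            add_multiple(i, k, -a[i][k] / a[k][k])

    diagonal = [a[i][i] for i in range(n)]
    index = sum(1 for d in diagonal if d < 0)
    nullity = sum(1 for d in diagonal if d == 0)
    return index, nullity


# Hessian of x^2 - y^2 (as a function of x, y, z) at the origin.
print(index_and_nullity([[2, 0, 0], [0, -2, 0], [0, 0, 0]]))
```

For the function $x^2 - y^2$ of three variables the origin is thus a critical point of index 1 and nullity 1, in the sense of the definitions above.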


Let t = a and t = b be any two points on g. We shall prove the following
theorem.

THEOREM 8.5. The numbers of zeros of the two conjugate point determinants
D(t,a) and D(t,b) on any finite open interval of g differ by at
most n, where n = m − 1.

Let p < t < q be any finite open interval of g. Let r be a point following
p on p < t < q. Suppose r is not a or b or c and that there is no conjugate
point of a or b on p < t ≤ r. Similarly let s be a point preceding
q on r < t < q. Suppose s is not a or b or c and that there is no conjugate
point of a or b on s ≤ t < q. Understanding that Q(rs) denotes the index
form corresponding to the finite open interval r < t < s of g, and that I(rs)
denotes the index of Q(rs), we shall first establish the following statement.


(a). The number of zeros of D(t,a) on r < t < s is equal to
(8.18) I(rs) + k
where k is an integer or zero.


There are three cases to be considered:


Case 1. a < r < s,
Case 2. r < s < a,
Case 3. r < a < s.


To prove Case 1 we first set up index forms Q(ar) and Q(rs). We then
set up Q(as) taking one intermediate manifold, $M_r$, at the point r and the
same intermediate manifolds preceding and following $M_r$ as are used to define
Q(ar) and Q(rs) respectively. When the variables of Q(as) belonging to the
intermediate manifolds preceding and following $M_r$ are the same as the
variables of Q(ar) and Q(rs) respectively, the quadratic form obtained by
setting the n variables belonging to $M_r$ in Q(as) equal to zero is equal to
Q(ar) + Q(rs). Hence

$I(as) - n \le I(ar) + I(rs) \le I(as)$.

See Morse 1, p. 62, Lemma 7.2. It follows that

(8.19) $I(as) - I(ar) = I(rs) + k$, $(0 \le k \le n)$

where k is an integer or zero. But, since conjugate points are counted according
to their orders, the left member of (8.19) represents the number of conjugate
points of a on r < t < s. Moreover, since the order of a conjugate
point of a is equal to the order of vanishing of D(t,a) at that point, the
number of zeros of D(t,a) on r < t < s is given by (8.18) when a < r < s.

For Case 2 we interchange the roles of r and s in Case 1 and obtain

$I(ar) - I(as) = I(sr) + k$, $(0 \le k \le n)$

where k is an integer or zero. But I(sr) = I(rs), and statement (a) follows
as in Case 1.
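
The passage from the double inequality of Case 1 to the equation with the integer k is only a relabeling. A short worked version, with k defined explicitly (the definition is spelled out here for illustration and is not displayed in the original), reads:

```latex
% Case 1 (a < r < s): the double inequality from Morse 1, Lemma 7.2.
\[
  I(as) - n \;\le\; I(ar) + I(rs) \;\le\; I(as).
\]
% Define k as the amount by which the middle member falls short of I(as):
\[
  k \;=\; I(as) - I(ar) - I(rs).
\]
% The right-hand inequality gives k >= 0, the left-hand one gives k <= n, so
\[
  I(as) - I(ar) \;=\; I(rs) + k, \qquad 0 \le k \le n .
\]
```

Case 2 is the same computation with r and s interchanged.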

For Case 3 we first set up index forms Q(ra) and Q(as), and then Q(rs),
taking one intermediate manifold, $M_a$, at a to define Q(rs), and taking the
same intermediate manifolds preceding and following $M_a$ as are used for Q(ra)
and Q(as) respectively. If in particular a = c, then $M_a$ must be taken as
the deflecting manifold M. As in Case 1 we have

$I(rs) - n \le I(ra) + I(as) \le I(rs)$.

Since I(ar) = I(ra), it follows that

(8.20) $I(ar) + I(as) = I(rs) - n + k$, $(0 \le k \le n)$

where k is an integer or zero. The number of conjugate points of a on
r < t < s is given by the left member of (8.20). But since the conjugate
point determinant D(t,a) vanishes to the n-th order at a, and since
r < a < s, the number of zeros of D(t,a) on r < t < s is given by the
right member of (8.20) increased by n. This completes the proof of
statement (a).

Returning to the theorem, we see that the numbers of zeros of D(t,a)
and D(t,b) on r < t < s differ by at most n. From our choice of r and s,
it follows that the numbers of zeros of D(t,a) and D(t,b) on p < t < q
differ by at most n.
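
The final count can be made explicit. In the sketch below, $Z_a$, $Z_b$, $k_a$ and $k_b$ are labels introduced here for illustration (they do not appear in the original): $Z_a$ and $Z_b$ are the numbers of zeros on r < t < s of the conjugate point determinants belonging to a and b, and $k_a$, $k_b$ are the integers which statement (a) attaches to a and b.

```latex
% Statement (a), applied once with base point a and once with base point b,
% over the same interval r < t < s and hence with the same index I(rs):
\[
  Z_a = I(rs) + k_a, \qquad Z_b = I(rs) + k_b,
  \qquad 0 \le k_a \le n, \quad 0 \le k_b \le n .
\]
% Subtracting, the common term I(rs) cancels:
\[
  |Z_a - Z_b| \;=\; |k_a - k_b| \;\le\; n .
\]
```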

The following is an easy corollary. 


COROLLARY 8.5. The numbers of zeros of two conjugate point determinants
D(t,a) and D(t,b) on any finite interval (open or closed) of g
differ by at most n, where n = m − 1.


Conclusion. The index theory can be directly extended to the case that
g is a broken extremal with any finite number of corners, at each of which g
is cut across by a regular (m−1)-manifold of class C², not tangent to either
arc of g at the corner, and at each of which g satisfies a corresponding set of
primary incidence relations. Moreover, if the initial end point t' of a broken
extremal g lies on a regular (m−1)-manifold $\mathfrak{M}$ of class C², which cuts g
transversally at t', but is not tangent to g at t', the index theory has
corresponding theorems, stated in terms of "focal" points of $\mathfrak{M}$ and their orders.


SWEET BRIAR COLLEGE,
SWEET BRIAR, VIRGINIA.


BIBLIOGRAPHY.

Bolza, O.
1. Vorlesungen über Variationsrechnung, Berlin, Teubner (1909).

Bliss, G. A.
1. (With Mason, M.) "A problem of the calculus of variations in which the integrand is discontinuous," Transactions of the American Mathematical Society, vol. 7 (1906), pp. 325-336.
2. (With Mason, M.) "The properties of curves in space which minimize a definite integral," Transactions of the American Mathematical Society, vol. 9 (1908), pp. 440-466.
3. "Jacobi's condition for problems of the calculus of variations in parametric form," Transactions of the American Mathematical Society, vol. 17 (1916), pp. 195-206.

Graves, L. M.
1. "Discontinuous solutions in space problems of the calculus of variations," American Journal of Mathematics, vol. 52 (1930), pp. 1-28.
2. "Discontinuous solutions in the calculus of variations," Bulletin of the American Mathematical Society, vol. 36 (1930), pp. 831-846. This paper contains further references.

Miles, E. J.
1. "Some properties of space curves minimizing a definite integral with discontinuous integrand," Bulletin of the American Mathematical Society, vol. 20 (1913), pp. 11-19.

Morse, M.
1. "The calculus of variations in the large," American Mathematical Society Colloquium Publications 18, New York (1934).
2. "The index theorem in the calculus of variations," Duke Mathematical Journal, vol. 4 (1938), pp. 231-246.




ON 0-REGULAR TRANSFORMATIONS.* 
By A. D. WALLACE. 


1. Introduction. In this paper we consider a particular type of interior
transformation which we call a 0-regular transformation. A mapping of this
type may be roughly described by saying that the inverse sets (of points) are
uniformly locally connected and in addition form a continuous collection.
More accurately we require of the continuous transformation $T(A) = B$ that
for any sequence $b_n \to b$ in B we have (i) $T^{-1}(b_n) \to T^{-1}(b)$ and (ii) this
convergence be regular relative to 0-cycles in the sense of Whyburn. The
condition (i) is a characterization of interior transformations due to Eilenberg.
There are many obvious generalizations of this notion with which we
shall not be concerned.

We show that any 0-regular transformation may be factored into two 
0-regular transformations of which the first is monotone and the second of 
constant multiplicity. This result is important in studying the effect of the 
transformation. It is also shown that 0-regular convergence is preserved under 
the inverse of a 0-regular transformation. We prove that (the mapped space 
being a locally connected continuum) cut-points, end-points and A-sets map 
respectively into cut-points, end-points and A-sets. In particular a 0-regular 
transformation is topological on a dendrite, and is monotone if the image 


space is a dendrite. 


2. General theorems. We suppose throughout that $T(A) = B$ is a
continuous transformation defined on the metric space A, and that B contains
more than one point. The following definition is due to G. T. Whyburn [1]:
If the sequence of closed sets $\{M_n\}$ converges to M, then $M_n \to M$ 0-regularly
provided that for each $\varepsilon > 0$ there are positive numbers $\delta$ and N such that for
$n > N$ any two points x and y in $M_n$ with $\rho(x, y) < \delta$ lie in an ε-continuum¹
in $M_n$. It is readily seen that the following result holds:


(2.1) If $M_n \to M$ in a compact metric space, then in order that the convergence
be 0-regular it is necessary and sufficient that for each positive ε
there exist positive numbers δ and N such that for $p \in M$ and $n > N$, the set
$V_\delta(p) \cdot M_n$ is contained in a connected subset of $V_\varepsilon(p) \cdot M_n$.²


* Received July 31, 1939.
¹ An ε-set is a set of diameter less than ε.
² The symbol $V_\varepsilon(p)$ denotes the set of points not farther from p than ε.




The following example is of interest: Let $\{f_n(x)\}$ be a sequence of real-valued
continuous functions defined on the unit interval and converging to
the function $f_0(x)$. Let $M_i$ be the graph of the function $y = f_i(x)$. In order
that $f_n(x) \to f_0(x)$ uniformly it is necessary and sufficient that $M_n \to M_0$
0-regularly.
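
The contrast between uniform and merely pointwise convergence in this example can be made concrete numerically. The sketch below is an illustration only, not taken from the paper: the two families `tent` and `damped_sine`, the helper `sup_deviation`, and the sampling grid are all assumptions chosen here, and sampling on a finite grid is of course not a proof. Both families converge pointwise to $f_0 = 0$ on the unit interval, but only the second does so uniformly; the graphs of the tent functions accumulate on the vertical segment over x = 0, so they do not converge to the graph $M_0$ at all.

```python
import math


def tent(n, x):
    """Tent of height 1 supported on [0, 2/n], with its peak at x = 1/n."""
    return max(0.0, 1.0 - abs(n * x - 1.0))


def damped_sine(n, x):
    """sin(n x) / n, bounded in absolute value by 1/n."""
    return math.sin(n * x) / n


def sup_deviation(f, n, points_per_step=10):
    """Largest sampled value of |f_n(x) - 0| on a grid of [0, 1].

    The grid has spacing 1 / (points_per_step * n), so it contains x = 1/n,
    where the tent function attains its maximum.
    """
    steps = points_per_step * n
    return max(abs(f(n, k / steps)) for k in range(steps + 1))


for n in (5, 50, 500):
    print(n, sup_deviation(tent, n), sup_deviation(damped_sine, n))
```

The sampled deviation of the tent functions stays at 1 for every n, while that of the damped sines is at most 1/n.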

I shall say that the transformation $T(A) = B$ is 0-regular provided that
if $y_n \to y$ in B, the sets $T^{-1}(y_n) \to T^{-1}(y)$ 0-regularly. It follows immediately
from a theorem due to Eilenberg [2] that if T is 0-regular it is interior; that
is, open sets map into open sets [3]. The proof of the following result presents
no difficulty:


(2.2) In order that the interior transformation $T(A) = B$ be 0-regular,
where A is compact, it is necessary and sufficient that for each $\varepsilon > 0$ there
exist a $\delta > 0$, such that if x and y are in A with $\rho(x, y) < \delta$ and $T(x) = T(y)$,
then x and y lie in an ε-continuum in $T^{-1}T(x) = T^{-1}T(y)$.


(2.3) If $T(A) = B$ is interior, where A is compact, and if the sequence of
closed sets $Y_n \to Y$ in B, then we have $T^{-1}(Y_n) \to T^{-1}(Y)$, and if this convergence
is 0-regular so is the convergence $Y_n \to Y$.


Proof. For the proof that $T^{-1}(Y_n) \to T^{-1}(Y)$ see [4]. Assume that
the convergence is 0-regular. Let ε be a positive number and e a positive
number such that if $a \in A$ and $b = T(a)$ then $T(V_e(a)) \subset V_\varepsilon(b)$; take d > 0
and N > 0 for this e as in (2.1). Pick δ > 0 so that if $b \in B$ and $a \in T^{-1}(b)$
we have $V_\delta(b) \subset T(V_d(a))$. This latter is possible by a theorem due to
G. T. Whyburn [5]. Let p be any point of Y and let x and y be points
of $V_\delta(p) \cdot Y_n$, n > N. If $q \in T^{-1}(p) \subset T^{-1}(Y)$, then $V_\delta(p) \cdot Y_n \subset T(V_d(q) \cdot T^{-1}(Y_n))$
and we can find points x' and y' in $V_d(q) \cdot T^{-1}(Y_n)$
mapping into x and y respectively. Since n > N we know that x' and y' lie
in a connected subset H of $V_e(q) \cdot T^{-1}(Y_n)$ in virtue of the 0-regular
convergence $T^{-1}(Y_n) \to T^{-1}(Y)$. Hence $T(H) \subset T(V_e(q) \cdot T^{-1}(Y_n)) \subset V_\varepsilon(p) \cdot Y_n$.
Thus x and y lie in a connected subset of $V_\varepsilon(p) \cdot Y_n$. This
completes the proof in virtue of (2.1).


(2.31) If $T(A) = B$ is 0-regular and A is compact, and T is factored,
$T = T_2T_1$, so that $T_1(A) = A'$ is interior, then $T_2(A') = B$ is 0-regular.


Proof. Suppose that $y_n \to y$ in B. Then $T^{-1}(y_n) \to T^{-1}(y)$ 0-regularly
in A. Hence $T_1T^{-1}(y_n) \to T_1T^{-1}(y)$ 0-regularly by (2.3); but for any $b \in B$
we have $T_1T^{-1}(b) = T_2^{-1}(b)$, so that $T_2^{-1}(y_n) \to T_2^{-1}(y)$ 0-regularly.

The following example shows that this theorem is false if $T_1$ is not
interior: Let A be the circle |z| = 1 and B the circle |w| = 1 and T the
transformation $w = z^2$. Let $T_1$ be the transformation mapping A into a
lemniscate by identifying the points 1 and −1. Then $T_2$ is not 0-regular.
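
The failure of 0-regularity in this example can be checked directly from the definition. The sketch below uses notation assumed here for illustration: A' denotes the lemniscate obtained from the circle by identifying 1 and −1, and $T_2$ the second factor, so that $T_2T_1 = T$ with $T(z) = z^2$.

```latex
% Inverse sets of the second factor T_2 : A' -> B.
\begin{align*}
  T_2^{-1}(w) &= \{\, T_1(\sqrt{w}),\ T_1(-\sqrt{w}) \,\}
      && (|w| = 1,\ w \neq 1), \\
  T_2^{-1}(1) &= \{\, T_1(1) \,\} = \{\, T_1(-1) \,\}.
\end{align*}
% Take w_n -> 1 with w_n != 1.  The two points of T_2^{-1}(w_n) both tend to
% the node T_1(1) of the lemniscate, so their distance tends to zero:
\[
  \rho\bigl( T_1(\sqrt{w_n}),\ T_1(-\sqrt{w_n}) \bigr) \longrightarrow 0 .
\]
% But T_2^{-1}(w_n) consists of two distinct points, and the only continua
% contained in a two-point set are single points.  Hence for no \varepsilon
% do these two points lie in an \varepsilon-continuum of T_2^{-1}(w_n), and
% the convergence T_2^{-1}(w_n) -> T_2^{-1}(1) is not 0-regular.
```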
Before proceeding to the proof of a factor theorem we need the following 


Lemma. Let $\{M_n\}$ be a sequence of disjoint locally connected closed sets
converging 0-regularly to the locally connected set M in a compact metric
space. Suppose that $M \cdot M_n = 0$. If $M = X^1 + \cdots + X^k$ is a decomposition
into components then there is an integer N such that if n > N,
$M_n = X_n^1 + \cdots + X_n^k$ is a decomposition into components and for each
$i = 1, 2, \cdots, k$, we have $X_n^i \to X^i$ 0-regularly and if $i \neq j$, $X^i \cdot \lim X_n^j$
is vacuous.


Proof. Let X be a component of M and $\{X_n\}$ a sequence of components
of the sets $\{M_n\}$ so chosen that X and $\liminf X_n$ have a point in common.
Let $\{X_{n_i}\}$ be a convergent subsequence, say $X_{n_i} \to X'$. From the local
connectivity of $M_n$ it follows that $M_n - X_n$ is closed. Let $\{M_{n_{i_j}} - X_{n_{i_j}}\}$ be a
convergent sequence chosen from the sequence $\{M_{n_i} - X_{n_i}\}$ and converging
to a set Y. It follows that $M = Y + X'$ and since $(M_{n_{i_j}} - X_{n_{i_j}}) \cdot X_{n_{i_j}} = 0$,
we have $Y \cdot X' = 0$ as a consequence of the 0-regular convergence [1]. Since
$X \cdot X' \neq 0$ and X' is a continuum it follows that X = X'. Thus every
convergent subsequence of $\{X_n\}$ converges to X, so that $X = \lim X_n$.

It follows immediately from the definition of 0-regular convergence that
$X \cdot \limsup (M_n - X_n)$ is empty so that $\limsup (M_n - X_n) \subset M - X$; from
the fact that $X_n \to X$, a component of M, we deduce that $M - X \subset \liminf (M_n - X_n)$.
Hence $M_n - X_n \to M - X$. Since $(M_n - X_n) \cdot X_n = 0$
and $(M - X) \cdot X = 0$ we conclude that the former sets converge
0-regularly to the latter and thus [1] that $X_n \to X$ and $M_n - X_n \to M - X$
0-regularly. The proof of the lemma may be carried through by induction.

The transformation $T(A) = B$ is locally topological or a local homeomorphism,
provided that it is interior and that each point in A admits a
neighborhood on which T is topological [6]; T is said to be monotone provided
that for each $b \in B$, the set $T^{-1}(b)$ is connected [7].


(2.4) If $T(A) = B$ is 0-regular and A is a continuum, then T can be
factored $T = T_2T_1$ so that
(i) $T_1(A) = A'$ is monotone and 0-regular
(ii) $T_2(A') = B$ is of constant multiplicity and locally topological.

Proof. As a consequence of a theorem due to Whyburn [8] we know that
T can be factored so that $T_1$ is monotone and $T_2$ is interior. Also for each
$b \in B$ it follows that $T^{-1}(b)$ is locally connected. If $y_n \to y$ in A' we must
show that $T_1^{-1}(y_n) \to T_1^{-1}(y)$ 0-regularly. But from the proof of the factor
theorem just cited it follows that each set $T_1^{-1}(x)$, $x \in A'$, is a component of
$T^{-1}T_2(x)$ and since T is 0-regular $T^{-1}T_2(y_n) \to T^{-1}T_2(y)$. But $T_1$ is continuous
so that $T_1^{-1}(y) \cdot \limsup T_1^{-1}(y_n) \neq 0$, so that by the lemma $T_1^{-1}(y_n) \to T_1^{-1}(y)$
0-regularly. This completes the proof of (i).

By (2.31) $T_2$ is 0-regular since we have shown that $T_1$ is 0-regular and
hence interior. Let p(b) be the number of points in $T_2^{-1}(b)$, that is, the
number of components in $T^{-1}(b)$. By the lemma p(b) is a continuous
function and since B is connected it follows that p(b) is constant. It readily
follows from this and the 0-regularity of $T_2$ that this transformation is
locally topological.

As a matter of convenience we state explicitly the following: 


(2.41) If $T(A) = B$ is 0-regular on the continuum A and X is a component
of $T^{-1}(y)$ then there exists a sequence of points $y_n \to y$ in B and a sequence
of components $\{X_n\}$ of $\{T^{-1}(y_n)\}$ converging 0-regularly to X. The sequence
$\{X_n\}$ is essentially unique in the sense that if $\{Z_n\}$ is any sequence of components
of $\{T^{-1}(y_n)\}$ converging to X, then the sequences $\{X_n\}$ and $\{Z_n\}$
differ only in a finite number of terms.


(2.5) In order that the transformation $T(A) = B$ be 0-regular on the
compact space A, it is necessary and sufficient that for any sequence of closed
sets $Y_n \to Y$ 0-regularly we have $T^{-1}(Y_n) \to T^{-1}(Y)$ 0-regularly.


Proof. The sufficiency of the condition is obvious. Assume now that
T is 0-regular and let $Y_n \to Y$ 0-regularly. Since T is interior it follows
that $T^{-1}(Y_n) \to T^{-1}(Y)$ [4]. Let ε > 0 and select u > 0 from (2.2) so that
if z and w are any two points of A such that $\rho(z, w) < u$ and $T(z) = T(w)$ then z and w lie
in an ε/3 continuum in $T^{-1}T(z)$. Let t be a positive number less than the
smaller of u/3 and ε/3. Let e > 0 be so chosen that if $b \in B$ and $a \in T^{-1}(b)$,
then $V_e(b) \subset T(V_t(a))$ [8]. Since $Y_n \to Y$ 0-regularly there are positive
numbers d and N such that if n > N and $q \in Y$, then any two points of
$V_d(q) \cdot Y_n$ lie in a connected subset of $V_e(q) \cdot Y_n$, by (2.1). Since T is
continuous and A is compact there is a positive number δ such that if $a \in A$
and $b = T(a)$ then $T(V_\delta(a)) \subset V_d(b)$. If n > N and $p \in T^{-1}(Y)$ I have
to show that any two points x and y in $V_\delta(p) \cdot T^{-1}(Y_n)$ lie in a continuum
in $V_\varepsilon(p) \cdot T^{-1}(Y_n)$. To this end it is shown that x and y can be r-chained
for all small positive r in the set $V_{2\varepsilon/3}(p) \cdot T^{-1}(Y_n)$.

Let r be positive and less than u/3 and pick s > 0 so that if $b \in B$ and
$a \in T^{-1}(b)$ then $V_s(b) \subset T(V_r(a))$. Since x and y are in $V_\delta(p) \cdot T^{-1}(Y_n)$
then x' and y' (the images of x and y respectively) are in $V_d(p')$, $p' = T(p)$.
It follows that $x' + y' \subset$ a connected subset of $V_e(p') \cdot Y_n$. Hence we can
find a chain

$x' = b_0, b_1, \cdots, b_m = y'$, $b_i \in V_e(p') \cdot Y_n$, $\rho(b_i, b_{i+1}) < s$.

Now $V_e(p') \cdot Y_n \subset T(V_t(p) \cdot T^{-1}(Y_n))$ so that we can find points

$x = a_0, a_1, \cdots, a_m = y$, $a_i \in V_t(p) \cdot T^{-1}(Y_n)$, $T(a_i) = b_i$.

Since

$b_{i+1} \in V_s(b_i) \cdot Y_n \subset T(V_r(a_i))$

there is a point $c_{i+1}$ in $V_r(a_i) \cdot T^{-1}(Y_n)$ mapping onto $b_{i+1}$. For each i we
thus have $\rho(a_i, c_{i+1}) < r$. Also

$\rho(a_{i+1}, c_{i+1}) \le \rho(a_i, c_{i+1}) + \rho(a_i, p) + \rho(a_{i+1}, p) < r + 2t < u$.

Hence $a_{i+1}$ and $c_{i+1}$ lie in a continuum $K_{i+1}$ of diameter less than ε/3 in
$T^{-1}T(a_{i+1}) \subset T^{-1}(Y_n)$. Since $\rho(a_{i+1}, p) < \varepsilon/3$, no point of $K_{i+1}$ is farther
from p than 2ε/3. We can now chain $a_{i+1}$ to $c_{i+1}$ by an r-chain in
$V_{2\varepsilon/3}(p) \cdot T^{-1}(Y_n)$. Thus x can be r-chained to y for all small r in
$V_{2\varepsilon/3}(p) \cdot T^{-1}(Y_n)$ and it follows that x and y lie in a connected subset
of $V_\varepsilon(p) \cdot T^{-1}(Y_n)$.

It is clear that (2.5) implies the following: 


(2.51) If Y is a locally connected closed set in B then $T^{-1}(Y)$ is locally
connected.


From (2.5) we also get the following product theorem:


(2.6) If $T_1(A) = A'$ and $T_2(A') = B$ are 0-regular where A is compact,
then so also is $T = T_2T_1$.


Proof. For if $y_n \to y$ in B, then $T_2^{-1}(y_n) \to T_2^{-1}(y)$ 0-regularly. Hence
$T_1^{-1}T_2^{-1}(y_n) \to T_1^{-1}T_2^{-1}(y)$ 0-regularly. But for any $b \in B$ we have
$T^{-1}(b) = T_1^{-1}T_2^{-1}(b)$.


3. 0-Regular transformations on continua. In this section we shall
suppose that $T(A) = B$ is 0-regular and that A is a continuum.


(3.1) If H is any subset of B then T is 0-regular on $T^{-1}(H)$. If H is a
connected subset of B then T is 0-regular on each of the finite number of
components of $T^{-1}(H)$ and each such component maps onto all of H under T.


* This proof is similar to proofs given by G. T. Whyburn [5] and W. T. Puckett
[11] for somewhat different results. The result (2.51) has also been proved in the
cited paper of Puckett.


Proof. The proof of the first statement is immediate. Assume that H
is a connected subset of B and $T^{-1}(H) = P + Q$ where $P \cdot \overline{Q} + \overline{P} \cdot Q = 0$. It
is readily seen that if $P \neq 0$ we have $T(P) = H$ and since for each $b \in B$
the set $T^{-1}(b)$ has only a finite number of components, there are at most a
finite number of components in $T^{-1}(H)$. Since any one of these can be
taken as P in the above decomposition it only remains to show that T is
0-regular on each component. If K is a component of $T^{-1}(H)$ then K is
open in $T^{-1}(H)$ and hence T is interior on K. Thus if $y_n \to y$ in H we have
$K \cdot T^{-1}(y_n) \to K \cdot T^{-1}(y)$. But if $b \in H$ then any component of $T^{-1}(b)$ which
intersects K certainly lies in K. Hence each component of $K \cdot T^{-1}(b)$ is
also a component of $T^{-1}(b)$. By (2.41) the result follows.


(3.2) If the continuum X separates A and lies in the inverse of a point
$x \in B$, then X is a component of $T^{-1}(x)$.


Proof. We may write

$A = H + K$, $H \cdot K = X$,

where H and K are continua. If $y \in B - x$, then clearly any component of
$T^{-1}(y)$ is either in $H - X$ or $K - X$. Let X' be the component of $T^{-1}(x)$
containing X. In H we can select a sequence of points $\{z_n\}$ not in X and
which converge to a point $z \in X$. If $X_n$ is the component of $T^{-1}T(z_n)$ which
contains $z_n$ then $X_n$ is in $H - X$ and by (2.41) $X_n \to X'$. Hence $X' \subset H$.
Similarly $X' \subset K$, so that X = X'.


(3.3) If the closed set X separates A irreducibly between two points and 
is contained in a component of the inverse of a point, then X is this component. 


(3.4) If the point x of A is an end-point, a regular point in the sense of
Menger, or a cut-point of A, then x is a component of $T^{-1}T(x)$, and T is
locally topological in a neighborhood of x.


Proof. The first two cases follow because x cannot lie on a continuum
of convergence. The third follows from (3.2).

(3.5) If the continuum A does not contain uncountably many non-degenerate
mutually exclusive continua, then each 0-regular transformation on
A is locally topological.


Proof. If we assume that the result is false we can find a non-degenerate
component X of the inverse of a point $x \in B$, and a neighborhood U of X
such that for each $y \in B$ any component of $T^{-1}(y)$ which intersects U is
non-degenerate. But T(U) is a neighborhood of x and since B is a continuum
T(U) contains uncountably many points. Hence A contains uncountably
many non-degenerate mutually exclusive continua.


4. Results for locally connected continua. We now suppose that A
is a locally connected continuum and T is 0-regular unless the contrary is
explicitly stated. We also assume a knowledge of the cyclic element theory
of such spaces, Kuratowski and Whyburn [9].


(4.1) If T is a local homeomorphism and J is a simple closed curve in B,
then each component of $T^{-1}(J)$ is a simple closed curve mapping onto all of J
under T. If D is a dendrite in B then there are k components in $T^{-1}(D)$ (k being
the multiplicity of T) each of which is a dendrite mapping topologically onto
D under T.


Proof. If Z is a locally connected continuum and $Z_1$ is a component of
$T^{-1}(Z)$ then $T(Z_1) = Z$ is locally topological and hence $Z_1$ is a locally connected
continuum. The proof of the first statement is immediate. Let $D_1$ be
a component of $T^{-1}(D)$. Since $T(D_1) = D$ is interior there exists a dendrite
$D' \subset D_1$ such that $T(D') = D$ is topological by a theorem due to Whyburn
[10]. But since T is locally topological it is clear that $D' = D_1$.


(4.11) If A is a dendrite then T is topological, if B is a dendrite then T
is monotone.


Proof. Using the notation of (2.4), if A is a dendrite then A' is a
dendrite since $T_1$ is monotone. As in (4.1) it follows that $T_2$ is topological.
Hence T is monotone. But by (3.4) each point $x \in A$ is a component of
$T^{-1}T(x)$. Similar reasoning applies to the second statement.
(4.2) If x is a cut-point of A then y = T(x) is a cut-point of B.


Proof. By (3.4) and with the notation of (2.4) we see that x is a
component of $T^{-1}T(x)$, i.e., $x = T_1^{-1}T_1(x)$, since $T_1$ is monotone. If
$y_1 = T_1(x)$ did not cut A' then $A' - y_1$ would be connected and hence so
would $T_1^{-1}(A' - y_1) = A - x$. Hence $y_1$ cuts A'. Thus we may write
$A' = M + N$, $M \cdot N = y_1$, where M and N are non-degenerate continua. Now
$T_2(y_1) = y_2$ is clearly not an end-point and if it is not a cut-point it lies in
a true cyclic element E of B. We can find arcs $ay_1$ and $by_1$ in M and N
respectively such that $T_2$ is topological on the arc $ay_1 + y_1b$. Let $T_2(a) = c$,
$T_2(b) = d$. If $cy_2 - y_2$ were in $B - E$ it would lie in some component R
of this set. But then clearly we would have $F(R) = \overline{R} - R = y_2$ and hence
$y_2$ is a cut-point, contrary to our assumption. There are thus some subarcs
of $cy_2$ and $y_2d$ in E and we may assume that $cy_2 + y_2d$ is a subset of E. Now
c and d lie on a simple closed curve J' in E, $J' = cpd + cqd$. Let cpd be
the arc of J' not containing $y_2$. We may assume that c is the first point on
$y_2c$ in cpd and that d is the first point on $y_2d$ in cpd. Finally let J be the
simple closed curve $cy_2d + cpd$, and let $J_1$ and $J_2$ be the components of
$T_2^{-1}(J)$ containing $ay_1$ and $y_1b$ respectively. Now $J_1$ and $J_2$ are simple
closed curves by (4.1) and since no simple closed curve can have points
other than $y_1$ in both M and N it follows that $J_1$ and $J_2$ are different. Hence
two different components of $T_2^{-1}(J)$ have the point $y_1$ in common. This is
a contradiction.

(4.3) If H is an A-set in A then T(H) is an A-set in B.


Proof. Let $r_1s_1$ be an arc in A' with $r_1$ and $s_1$ in $H_1 = T_1(H)$.
Since $T_1^{-1}(r_1s_1)$ is a locally connected continuum we can find an arc rs in
this set with r and s in H. Since H is an A-set rs lies in H and hence
$T_1(rs) = r_1s_1$ lies in $H_1$. Let $H_2 = T_2(H_1)$ and assume that $H_2$ is not an
A-set. We can then find an arc $u_2x_2v_2$ in B with $u_2x_2v_2 \cdot H_2 = u_2 + v_2$. There
is an arc $u_2y_2v_2$ in $H_2$. Let $J = u_2x_2v_2 + u_2y_2v_2$ and let $y_1$ be a point of
$H_1 \cdot T_2^{-1}(y_2)$. We can find a neighborhood U of $y_1$ on which $T_2$ is topological
and $V = T_2(U)$ will be a neighborhood of $y_2$. Now $y_2$ is in V and hence
a subarc $t_2$ of $u_2y_2v_2$ containing $y_2$ lies in V. We can thus find an arc $t_1$ of
$H_1$ which maps topologically onto $t_2$. Let $J_1$ be the component of $T_2^{-1}(J)$
which contains $t_1$. Since $H_1$ is an A-set it is clear that $J_1$ is in $H_1$ and hence
we have J in $H_2$. Thus $u_2x_2v_2$ is in $H_2$.


THE UNIVERSITY OF VIRGINIA. 


BIBLIOGRAPHY

1. G. T. Whyburn, Fundamenta Mathematicae, vol. 25 (1935), p. 408.
2. S. Eilenberg, Fundamenta Mathematicae, vol. 22 (1934), p. 292.
3. Stoilow, Principes topologiques de la théorie des fonctions, Paris, 1938.
4. A. D. Wallace, American Journal of Mathematics, vol. 61 (1939), p. 757.
5. G. T. Whyburn, Duke Mathematical Journal, vol. 4 (1938), p. 1.
6. S. Eilenberg, Fundamenta Mathematicae, vol. 24 (1935), p. 160.
7. G. T. Whyburn, American Journal of Mathematics, vol. 56 (1934), p. 370.
8. G. T. Whyburn, Duke Mathematical Journal, vol. 3 (1937), p. 370.
9. Kuratowski and Whyburn, Fundamenta Mathematicae, vol. 16 (1930), p. 305.
10. G. T. Whyburn, Bulletin of the American Mathematical Society, vol. 44 (1938), p. 414.
11. W. T. Puckett, American Journal of Mathematics, vol. 61 (1939), p. 750.


TWISTED CUBICS ASSOCIATED WITH A SPACE CURVE.* + 


By Louis GREEN. 


1. Introduction. Various methods have been employed in investigating 
the projective differential properties of a curve immersed in ordinary space, 
each method having certain advantages. The procedure used here is to start 
with a pair of dual differential equations, to introduce certain transformations
of coordinates in order to obtain canonical power-series expansions for
the curve considered, and to base the remainder of the paper on these expan- 
sions. The objectives of the paper are to characterize certain configurations 
associated with a curve, particularly the five-point twisted cubics, and to begin 
the problem of interpreting a duality formula in geometrical language. 


2. Analytic basis. The differential equations of a twisted curve Γ,
not belonging to a linear complex, may be written in the form

(1.1) $x'''' + ax'' + (a' - \theta)x' + cx = 0$,
(1.2) $\xi'''' + a\xi'' + (a' + \theta)\xi' + c\xi = 0$, (θ = const. ≠ 0),

where ξ represents the osculating plane of Γ at the point x; differentiation is taken
with respect to a properly chosen parameter u; and a, c are scalar functions
of u. The value of θ can be chosen arbitrarily (≠ 0); if θ = −1, these
equations are the ones derived by Fubini and Čech;* if θ = −4, then (1.1)
is the canonical form of Halphen.

When u is fixed at a suitable value $u_0$, a point O (= x) on Γ is obtained,
and a local tetrahedron of reference $D_1\{x, x', x'', x'''\}$ is formed, with a unit
point chosen so that any point whose coordinates in the original system are

$x_1x + x_2x' + x_3x'' + x_4x'''$

will have local coordinates proportional to $x_1, \cdots, x_4$. It follows readily that
the local coordinates $x_i$ of a point P on Γ "sufficiently near" O are

$x_i = \sum_{n=0}^{\infty} A_{in} (\Delta u)^n / n!$ $(i = 1, \cdots, 4)$,

where

$A_{10} = 1$, $A_{20} = A_{30} = A_{40} = 0$,
$A_{1,n+1} = A'_{1n} - cA_{4n}$,
$A_{2,n+1} = A_{1n} + A'_{2n} + (\theta - a')A_{4n}$, $(n \ge 0)$
$A_{3,n+1} = A_{2n} + A'_{3n} - aA_{4n}$,
$A_{4,n+1} = A_{3n} + A'_{4n}$.
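
The recursion for the coefficients $A_{in}$ can be tried out in a simple special case. The sketch below is an illustration under stated assumptions, not the author's computation: it takes a, c and θ to be constants, so that a' = 0 and every derivative $A'_{in}$ vanishes, and it reads equation (1.1) as $x'''' + ax'' + (a' - \theta)x' + cx = 0$. The function name `expansion_coefficients` is hypothetical. Row n of the result lists the components of the n-th derivative $x^{(n)}$ along $x, x', x'', x'''$.

```python
def expansion_coefficients(a, c, theta, order):
    """Rows (A_1n, A_2n, A_3n, A_4n) for n = 0..order.

    Assumes constant a, c, theta, so that a' = 0 and all A'_in = 0 in the
    recursion; this is a special case chosen for illustration.
    """
    rows = [(1, 0, 0, 0)]  # A_10 = 1, A_20 = A_30 = A_40 = 0
    for _ in range(order):
        a1, a2, a3, a4 = rows[-1]
        rows.append((
            -c * a4,          # A_{1,n+1} = A'_{1n} - c A_{4n}
            a1 + theta * a4,  # A_{2,n+1} = A_{1n} + A'_{2n} + (theta - a') A_{4n}
            a2 - a * a4,      # A_{3,n+1} = A_{2n} + A'_{3n} - a A_{4n}
            a3,               # A_{4,n+1} = A_{3n} + A'_{4n}
        ))
    return rows


for n, row in enumerate(expansion_coefficients(a=2, c=3, theta=-1, order=5)):
    print(n, row)
```

Row 4 reproduces the differential equation itself, $x'''' = -cx + \theta x' - ax''$, which gives a quick consistency check on the signs in the recursion.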
* Received February 13, 1939. † Presented to the Society, September 6, 1938.


* Introduction à la Géométrie Projective Différentielle des Surfaces, 1931, p. 26.




Halphen's local tetrahedron $H_1$, which will be used throughout this paper,
can be obtained directly from $D_1$. If local coordinates referred to $H_1$ are
denoted by $y_i$, then the following relations hold:


Yi /T = Ly + + 


(2.1) (7 arbitrary) 

Y3/T = + 

= 


= /200, = — 18006?) 

— — 10800a’6* + 144006") /3600086*, 
22 = 24 = — 42000) /6006", 

= 100c — 9a? — 30a”, = 


If non-homogeneous coordinates x, y, z are defined in the customary way,
the equations of Γ, relative to O as origin, are found to be


7 6 


The coefficients $p_n$, $q_n$ $(n \ge 7)$ are expressible in terms of $q_6$ and the coefficients
in the differential equation (1.1), and are understood, of course, to be
evaluated at $u_0$. The value of $q_6$, as is the case with θ, can be chosen
arbitrarily (≠ 0), but since a numerical choice for $q_6$ would prevent us from
displaying the weights* of the coefficients, we merely specify that $q_6$ be
independent of the parameter u. The values of $q_7$, $q_8$, $p_7$ are found from
(2.1) to be

dz = $/4200y', 
(4.1) qs = ($7 — 454'6 — 36008") /504000Y°9, 

Pr = (— — — 36006?) /1512000y°9. 


Let ω (= ξ) be the plane osculating Γ at O. Then a local plane tetrahedron
of reference $D_2\{\xi, \xi', \xi'', \xi'''\}$ with local coordinates proportional to
$\xi_1, \cdots, \xi_4$, if plane coordinates in the original system are

$\xi_1\xi + \xi_2\xi' + \xi_3\xi'' + \xi_4\xi'''$,

is formed from the differential equation (1.2). We replace $D_2$ by a new local
plane tetrahedron $H_2$, dual to $H_1$. If local plane coordinates referred to $H_2$
are denoted by $\eta_i$, then the relations between $\xi_i$ and $\eta_i$ are the same as those
between $x_i$ and $y_i$ in (2.1) except for the change in sign of θ.


* The weights of $p_n$ and $q_n$ are n − 2 and n − 3 respectively.




Setting 
we obtain the equations of T in system H,: 
00 
7 6 
where we choose 
Kg = 
The values of $\kappa_7$, $\kappa_8$, $\pi_7$ are obtained from $q_7$, $q_8$, $p_7$ as given in equations (4.1)
by changing the sign of θ. It then follows that


4,2 
Kx = (7469s — — 147") /Yo, = (2469s — — 28q,7) 


Returning to tetrahedron /7, and equations (3.1) we denote homogeneous 


plane cobrdinates by and non-homogeneous coédrdinates by é, 7, ¢, where 


Then the plane equations of T in system //, are 


A — + (89, — 21p,)E7/2187 -, 


The transformation 


2% Jonas 

= 81 — 

76°91 — + + (54G6* — 

carries (5) into (3.2) and hence is the transformation from $H_1$ to $H_2$. The
coordinates of the vertices of tetrahedron $H_2$ referred to system $H_1$ are thus
found to be


(1,0,0,0), Go, 9, 9), 
14qoqz, 396", 0), — 147 27 G0"). 


— 
— 


The u-derivatives of the local point coordinates $x_i$ of system $D_1$ are
obtained in the following way. In the original coordinate system any point z
in space has coordinates

$z = x_1x + x_2x' + x_3x'' + x_4x'''$.

Hence, from (1.1),

$z' = (x'_1 - cx_4)x + (x'_2 + x_1 + (\theta - a')x_4)x' + (x'_3 + x_2 - ax_4)x'' + (x'_4 + x_3)x'''$,

and placing each of the four coefficients equal to zero, we get

(8) $x'_1 = cx_4$, $x'_2 = -x_1 + (a' - \theta)x_4$, $x'_3 = -x_2 + ax_4$, $x'_4 = -x_3$.

The formulas for the derivatives of local coordinates $y_i$ of system $H_1$ are
found by differentiating equations (2.1), in which we choose

r= exp(fa2du), 

and by replacing the $x'_i$ and the $x_i$ by their values in terms of $y_i$ as obtained
from (8) and the inverse of (2.1). The resulting equations are


= — 1q7Y2 — 42p7y3 — 18 
oY = — + 149743 
= — + 21q7ys- 


Now $x = \gamma\,\Delta u + \cdots$, so that at $u = u_0$,

$y'_i = dy_i/du = \gamma\, dy_i/dx$.

Hence the x-derivatives of the local coordinates $y_i$ are gotten immediately.
These have also been obtained by Miss Newton.*


8. Duality. The differential equations of T show that relative to I at 
0 the dual of a point x; = f;(a,c,@) in system D, is the plane & =f; (a, c,—9) 
in system D,. From the similarity of the canonical expansions (3.1) and 
(3.2) it follows that the dual of a point yi = fi (pn, qm) referred to H, is the 
plane = fi (an, «m) referred to The codrdinates of this plane referred 
to H, are then obtainable from transformation (6). The problem of finding 
the dual, relative to T at 0, of a given point is therefore solved. 

But there appears to be a good deal more to the problem than this. For, 
the concept of. duality considered here is quite different from the duality 
theory of projective geometry. Two coincident points, for example, may have 
distinct dual planes. Thus, the points 


in system $H_1$ both lie on the x-axis and for properly chosen values of $l'$, $m'$, $n'$
coincide. Yet their dual planes, which can be found by the method described
above, are distinct. Furthermore, in order to obtain complete generality for


3 “Consecutive covariant configurations at a point of a space curve,” Transactions
of the American Mathematical Society, vol. 36 (1934), p. 61.


TWISTED CUBICS ASSOCIATED WITH A SPACE CURVE. 289 


both the curve $\Gamma$ and the position of the point $O$ on $\Gamma$, we do not wish to specify
the relations existing, at $u = u_0$, among the coefficients in the equations (3.1).
We therefore regard $P$ as one of a three-parameter family of points generating
the x-axis as $l$, $m$, $n$ vary independently over all real numbers; i.e. we fail
to consider the coincidence of $P$ and $P'$ unless $l, m, n = l', m', n'$ respectively.
Similarly, the point

$(kq_7, q_6, 0, 0)$

referred to tetrahedron $H_1$ lies on the x-axis and coincides with $P$ for proper
choice of $k$. Yet their geometrical characterizations and their dual planes are
completely unrelated, and the curves they generate as $u$ varies ($k$, $l$, $m$, $n$
remaining independent of $u$), are entirely different.

The problem of characterizing geometrically the dual of a given point
relative to $\Gamma$ at $O$ is completely untouched by the formal, analytic solution
above. A special case of this problem, for example, is to determine how the
point $P$, given above, is related geometrically to its dual plane under the
assumption that $l$, $m$, $n$ are arbitrary numerical quantities. The simplest
special case of the general problem is that of characterizing geometrically the
dual of a point whose coordinates are expressible in terms of $q_6$ and $q_7$ alone.
This problem is readily solved. For, we may write the coordinates of such a
point as

$y_i = f_i(q_6, q_7)$

when referred to tetrahedron $H_1$; hence the dual plane has coordinates

$\eta_i = f_i(\kappa_6, \kappa_7)$

when referred to $H_2$. By means of transformation (6) we find that this plane
has the equation
+ (81q6°%3 — Y2 + — + 4419697724) Ys 
+ — + 4419697723 + (54q6* — 24] ys = 0 
when referred to $H_1$. We have therefore proved the following result.

THEOREM 1. The dual, relative to $\Gamma$ at $O$, of a point whose coordinates
are expressible in terms of $q_6$ and $q_7$ is the polar of the point with respect to a
quadric $Q$ having the equation
(10) + — — + 

(54q¢* 343q7°) ys” 0. 

Sannia has considered⁴ a self-dual tetrahedron $S$ whose vertices, referred
to $H_1$, are


4 “Nuova trattazione della geometria proiettivo-differenziale delle curve sghembe,”
Annali di Matematica, IV, vol. 1 (1924), pp. 1-18; vol. 3 (1926), pp. 1-25.




$O$, $N(49q_7^2, 28q_6q_7, 12q_6^2, 0)$,

$B(343q_7^3 - 216q_6^4, 294q_6q_7^2, 252q_6^2q_7, 216q_6^3)$.
The quadric $Q$ has three-point and three-plane contact with $\Gamma$ at $O$, and has
among its rulings the edges $OT$, $ON$, $BT$, $BN$ of Sannia's tetrahedron. It is
contained in a one-parameter family of quadrics having this property.


4. Fundamental tetrahedra. In system $H_1$ the osculating conic $K_2$
of $\Gamma$ at $O$ is given by
(12.1) $4y_1y_3 - 3y_2^2 = 0 = y_4$,
and its dual, the osculating quadric cone $K'_2$, has the equation
(12.2) $y_3^2 - y_2y_4 = 0$.
Any tetrahedron $\{OP_1P_2P_3\}$ with vertices defined in the following way will
be called a fundamental tetrahedron of $\Gamma$ at $O$. One vertex is at $O$, a second
at an arbitrary point $P_1(3, t, 0, 0)$, $t \neq 0$, on the x-axis; a third vertex
$P_2(3, 2t, t^2, 0)$ is the contact point of a tangent from $P_1$ to the conic $K_2$, and
the fourth vertex $P_3(1 - \alpha t^3, t, t^2, t^3)$, for an arbitrary value of $\alpha$, lies on
the contact line of the cone $K'_2$ with its tangent plane which passes through $P_2$.
The tetrahedra $H_1$, $H_2$, $S$ are fundamental tetrahedra determined by the
following values of $t$, $\alpha$: $\infty$, $0$; $3q_6/7q_7$, $2q_6$; $6q_6/7q_7$, $q_6$; moreover, the dual
of any fundamental tetrahedron is another fundamental tetrahedron.
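
The incidences in this definition can be checked with exact rational arithmetic. The following is only a sketch under stated assumptions: the vertex coordinates $P_1(3, t, 0, 0)$, $P_2(3, 2t, t^2, 0)$, $P_3(1 - \alpha t^3, t, t^2, t^3)$ and the forms (12.1), (12.2) are taken in the reconstructed reading of the damaged text, the function names are hypothetical, and the sample values $q_6 = 2$, $q_7 = 3$ are arbitrary.

```python
from fractions import Fraction as F

def fundamental_tetrahedron(t, alpha):
    """Vertices O, P1, P2, P3 as homogeneous coordinates (y1, y2, y3, y4)."""
    t, alpha = F(t), F(alpha)
    O = (F(1), F(0), F(0), F(0))
    P1 = (F(3), t, F(0), F(0))               # on the x-axis
    P2 = (F(3), 2 * t, t**2, F(0))           # on the conic K2
    P3 = (1 - alpha * t**3, t, t**2, t**3)   # on the cubic T_alpha
    return O, P1, P2, P3

def conic_form(p, q):
    """Polarized form of 4*y1*y3 - 3*y2^2 (assumed reading of (12.1))."""
    return 2 * (p[0] * q[2] + p[2] * q[0]) - 3 * p[1] * q[1]

def cone_form(p, q):
    """Twice the polarized form of y3^2 - y2*y4 (assumed reading of (12.2))."""
    return 2 * p[2] * q[2] - p[1] * q[3] - p[3] * q[1]

def proportional(p, q):
    """True when p and q represent the same projective point."""
    return all(p[i] * q[j] == p[j] * q[i] for i in range(4) for j in range(4))

def check(t, alpha):
    O, P1, P2, P3 = fundamental_tetrahedron(t, alpha)
    t = F(t)
    return all([
        conic_form(P2, P2) == 0,   # P2 lies on K2
        conic_form(P2, P1) == 0,   # P1 lies on the tangent to K2 at P2
        cone_form(P3, P3) == 0,    # P3 lies on the cone K2'
        cone_form(P3, P2) == 0,    # tangent plane along OP3 passes through P2
        all(t * p[2] - p[3] == 0 for p in (O, P1, P3)),  # plane t*y3 - y4 = 0
    ])

q6, q7 = F(2), F(3)  # arbitrary sample values of the invariants
H2 = fundamental_tetrahedron(3 * q6 / (7 * q7), 2 * q6)
S = fundamental_tetrahedron(6 * q6 / (7 * q7), q6)
H2_P3 = (343 * q7**3 - 54 * q6**4, 147 * q6 * q7**2, 63 * q6**2 * q7, 27 * q6**3)
S_N = (49 * q7**2, 28 * q6 * q7, 12 * q6**2, 0)
S_B = (343 * q7**3 - 216 * q6**4, 294 * q6 * q7**2, 252 * q6**2 * q7, 216 * q6**3)
```

With these sample values the vertices computed for $H_2$ and $S$ are proportional to the coordinates quoted in the text for the fourth vertex of $H_2$ and for the points $N$ and $B$ of Sannia's tetrahedron.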
Associated with the fundamental tetrahedra is a family of twisted cubics,
$T_\alpha$, having five-point contact with $\Gamma$ at $O$, and expressed parametrically by
the equations

(13) $y_1 = 1 - \alpha\tau^3$, $y_2 = \tau$, $y_3 = \tau^2$, $y_4 = \tau^3$.

All of these cubics belong to the same null system, lie on the cone $K'_2$, and
have $K_2$ as osculating conic at $O$.⁵

Each choice of the point $P_1$, or of the plane $OP_1P_3$, determines a subset
of $\infty^1$ fundamental tetrahedra; these can be placed in a one-to-one
correspondence with the cubics $T_\alpha$ by choosing the vertex $P_3$ as the intersection,
besides $O$, of $T_\alpha$ and the plane $OP_1P_3$. When this is done, the following
relations exist:

The polars of the vertices of one of these tetrahedra with respect to the
common null system of the cubics are the faces of the tetrahedra, the tangent
to $T_\alpha$ at $P_3$ is the edge $P_2P_3$ and the osculating plane to $T_\alpha$ at $P_3$ is the face
$P_1P_2P_3$.⁶ As $P_1$ traces the x-axis, $\alpha$ remaining fixed, the edge $P_1P_3$ generates
a cubic surface with the equation


5 Lane, Projective Differential Geometry of Curves and Surfaces, 1932, p. 29.
6 Su, “Note on the projective differential geometry of space curves,” Journal of
the Chinese Mathematical Society, vol. 2 (1937), pp. 98-137.




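(14) $y_1y_4^2 - 3y_2y_3y_4 + 2y_3^3 + \alpha y_4^3 = 0$.

The dual of an arbitrary point on the cubic $T_\alpha$ can be shown to be the
plane which osculates the cubic $T_{\alpha'}$,

$y_1 = 1 - \alpha'\tau^3$, $y_2 = \tau$, $y_3 = \tau^2$, $y_4 = \tau^3$,

for which

$\alpha' = 2q_6 - \alpha$,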

at the point determined by 


t= 3q6l/(Vqrt 


THEOREM 2. The dual of the five-point cubic $T_\alpha$ is another five-point
cubic $T_{\alpha'}$ for which $\alpha + \alpha' = 2q_6$. The self-dual cubic is $T_{q_6}$, and the only
points of this cubic which lie in their dual planes are the points $O$ and $B$ of
Sannia's tetrahedron.


The self-dual cubic $T_{q_6}$ harmonically separates, with $O$, the points of any
two dual five-point cubics. It is called the harmonic cubic by Fubini and
Čech,⁷ and the coincidence cubic by Kanitani.⁸ Theorem 2 shows that all
five-point cubics are also five-plane cubics, and since $T_0$ is the six-point cubic,
then the six-plane cubic is $T_\alpha$ ($\alpha = 2q_6$).
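
The pairing $\alpha \mapsto \alpha' = 2q_6 - \alpha$ of Theorem 2 is easy to tabulate. The sketch below is illustrative only: the function name is hypothetical and $q_6 = 1$ is an arbitrary normalization. It confirms that $\alpha = q_6$ is the only fixed value, that the six-point cubic ($\alpha = 0$) pairs with $\alpha = 2q_6$, and that the values $4q_6$, $-2q_6$ and $5q_6/2$, $-q_6/2$ appearing in the dual Theorems 3.1, 3.2 and 4.1, 4.2 are paired in the same way.

```python
from fractions import Fraction as F

def dual_alpha(alpha, q6):
    """Index alpha' of the cubic dual to T_alpha, from alpha + alpha' = 2*q6."""
    return 2 * F(q6) - F(alpha)

q6 = F(1)  # arbitrary normalization; only the ratio alpha/q6 matters here

# Scan half-integer multiples of q6 for self-dual cubics.
self_dual = [a for a in (F(n, 2) for n in range(-8, 13)) if dual_alpha(a, q6) == a]

# Value pairs occurring in the dual pairs of theorems.
pairs = {a: dual_alpha(a, q6) for a in (F(0), F(4), F(5, 2))}
```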


5. The principal plane of a curve. Halphen's theorem⁹ on the principal
plane at a point of a curve has been extended by Bompiani¹⁰ and dualized
by Sannia.¹¹ We shall carry their results still further, basing all our calculations
on tetrahedron $H_1$.

Let the tangent developables of $\Gamma$ and of a five-point cubic $T_\alpha$ be cut by
an arbitrary plane $OP_1P_3$ ($\neq \pi$) passing through the x-axis, and let the plane
curves of section be denoted by $\Gamma'$ and $T'_\alpha$, the latter being a cusped cubic.
Then the following conclusions hold:


THEOREM 3.1. If $\alpha \neq 4q_6$, the curves $\Gamma'$ and $T'_\alpha$ have exactly six-point
contact at $O$ for all planes $OP_1P_3$. If $\alpha = 4q_6$, these curves always have just
seven-point contact, with the single exception that the plane given by


(15. 1) = = 0) 


produces curves having eight-point contact. 


7 Geometria Proiettiva Differenziale, vol. 1 (1926), p. 42.

8 “Sur les repères mobiles attachés à une courbe gauche,” Memoirs of the Ryojun
College of Engineering, vol. 6 (1933), p. 106.

9 “Sur les invariants différentiels des courbes gauches,” Journal de l'École Polytechnique,
vol. 28 (1880), p. 25.

10 “Sul contatto di due curve sghembe,” Memorie della Reale Accademia delle
Scienze dell'Istituto di Bologna, ser. 8, vol. 3 (1926), pp. 35-38.

11 Loc. cit.




If $\alpha$ and the plane of section $OP_1P_3$ are both arbitrary, the cusp of $T'_\alpha$
lies on the twisted cubic $T_\alpha$ and is the vertex $P_3$ of the fundamental tetrahedron
determined by $T_\alpha$ and $OP_1P_3$. The cusp-tangent is the edge $P_1P_3$
of this tetrahedron.


Bompiani's osculants¹² for the curve $\Gamma'$, which has an inflexion at $O$ for
arbitrary plane $OP_1P_3$, are obtained immediately. His fourth-order neighborhood
of $\Gamma'$ at $O$ is the vertex $P_1$ of the fundamental tetrahedra determined
by the plane $OP_1P_3$, while his neighborhood of the fifth order is the edge $OP_3$.
His neighborhood of the sixth order is the point P; which lies on the five- 
point twisted cubic = 

We shall prove only the first part of our theorem. The tangent developable
of $\Gamma$ has the parametric equations

$x = u + v$,
(16) $y = u^2 + p_7u^7 + \cdots + v(2u + \cdots)$,
$z = u^3 + q_6u^6 + q_7u^7 + q_8u^8 + \cdots + v(3u^2 + \cdots)$.


It meets the plane $OP_1P_3$, whose equation is

(17) $ty - z = 0$,

in a curve $\Gamma'$, represented by (17) and

(18) $y = -4x^3/t - 24x^4/t^2 - 156x^5/t^3 - (1072 + 128q_6t^3)x^6/t^4 - (7668 + 1728q_6t^3 + \cdots)x^7/t^5 - \cdots$.

If the non-homogeneous equations of the cubic $T_\alpha$ are written in series
form, its tangent developable is seen to have the equations

$x = u + v$,
(19) $y = u^2 - \alpha u^5 + \cdots + v(2u - 5\alpha u^4 + \cdots)$,
$z = u^3 - 2\alpha u^6 + \cdots + v(3u^2 - 12\alpha u^5 + 63\alpha^2u^8 + \cdots)$.

Its intersection with the plane $OP_1P_3$ is a curve $T'_\alpha$ whose equations are
(17) and

(20) $y = -4x^3/t - 24x^4/t^2 - 156x^5/t^3 - (1072 + 32\alpha t^3)x^6/t^4 - (7668 + 480\alpha t^3)x^7/t^5 - \cdots$.


The desired results then follow from (18) and (20). 
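
The numerical coefficients common to (18) and (20) can be checked independently. The following is a minimal sketch, assuming the reading of (18) and (20) as series in $x^3, \dots, x^7$ and keeping only the leading terms $y = u^2$, $z = u^3$ of both curves (so every term in $q_6$, $q_7$, $\alpha$ is dropped). Under that assumption the plane $ty - z = 0$ cuts the tangent developable in the curve $x = u(t - 2u)/(2t - 3u)$, $y = -u^3/(2t - 3u)$. All function names are hypothetical.

```python
from fractions import Fraction as F

N = 8  # all power series in u are truncated modulo u^8

def mul(a, b):
    """Product of two truncated series given as coefficient lists."""
    out = [F(0)] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < N:
                out[i + j] += ai * bj
    return out

def pad(coeffs):
    return [F(c) for c in coeffs] + [F(0)] * (N - len(coeffs))

def section_residual(t):
    """Coefficients of y(u) - sum c_n x(u)^n; all zero if the c_n are right."""
    t = F(t)
    inv = [F(3) ** n / (2 * t) ** (n + 1) for n in range(N)]  # 1/(2t - 3u)
    x = mul(pad([0, t, -2]), inv)      # u(t - 2u)/(2t - 3u)
    y = mul(pad([0, 0, 0, -1]), inv)   # -u^3/(2t - 3u)
    claimed = {3: -4 / t, 4: -24 / t**2, 5: -156 / t**3,
               6: -1072 / t**4, 7: -7668 / t**5}
    rhs = [F(0)] * N
    xn = pad([1])
    for n in range(1, N):
        xn = mul(xn, x)                # xn now holds x(u)^n
        if n in claimed:
            rhs = [r + claimed[n] * c for r, c in zip(rhs, xn)]
    return [a - b for a, b in zip(y, rhs)]

def alpha_for_seven_point(q6):
    """Value of alpha equating the x^6 coefficients 128*q6 and 32*alpha."""
    return F(128, 32) * F(q6)
```

Comparing the $x^6$ coefficients of (18) and (20) gives $128q_6 = 32\alpha$, that is $\alpha = 4q_6$, which is the exceptional value in Theorem 3.1.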


12 “Per lo studio proiettivo-differenziale delle singolarità,” Bollettino della Unione
Matematica Italiana, vol. 5 (1926), p. 118.

13 Su, loc. cit. See also his paper, “On certain twisted cubics projectively connected
with a space curve,” Journal of the Chinese Mathematical Society, vol. 2
(1937), p. 59.




Dually we choose a point $P_1$ ($\neq O$) on the x-axis as the vertex of cones
containing the curves $\Gamma$ and $T_\alpha$.


THEOREM 3.2. If $\alpha \neq -2q_6$, these cones have exactly six-plane contact
along $\pi$ for all points $P_1$. If $\alpha = -2q_6$, the cones always have just seven-plane
contact, with the single exception that the point with coordinates
(15. 2) (247, qo, 0, 0) 


produces cones having eight-plane contact. 


The next step would be to consider planes through $O$ which do not contain
the x-axis. Let $\Pi$ be such a plane, cutting the tangent developables of $\Gamma$ and
$T_\alpha$ in cusped curves $\Gamma''$ and $T''_\alpha$.

THEOREM 4.1. The order of contact of $\Gamma''$ and $T''_\alpha$ at $O$ is greater for
$\alpha = 5q_6/2$ than for any other value of $\alpha$, regardless of the position of the
plane $\Pi$. When $\alpha = 5q_6/2$, the order of contact is increased still further if
$\Pi$ contains the line whose equations are


(21.1) = 0 = 


and is greatest for a uniquely determined position of $\Pi$, namely


(22.1) 3q6°Y2— + (18qep2 — + 2092") ys = 0. 

We prefer to prove the dual theorem, and shall state here without proof
several additional results of Theorem 4.1. The osculating cusped cubics at $O$
of $\Gamma''$ and $T''_\alpha$ coincide if, and only if, $\alpha = 5q_6/2$, independently of the plane
Il. For fixed but arbitrary « let Ta” be the osculating cusped cubic of 74”, 
and let If vary in the bundle of planes through 0. Then the inflexion point 
of T,” generates the ruled surface (14) while the inflexion tangent of 7,” 
forms a congruence with the following properties. The focal sheets comprise 
an algebraic surface S of the sixth order whose asymptotic curves are twisted 
cubics ; on each line of the congruence the harmonic conjugate of the inflexion 
point with respect to the two focal points lies in the plane m (y;—=0); the 
developables of the congruence meet S in twisted cubics and meet m in a 
family of conics whose envelope is the conic K,. When, in particular, the 
inflexion tangent of 7,’’ passes through a point P, on K2, then the plane II 
is the face OP.P., of the fundamental tetrahedron determined by 74 and P2, 
while the two focal points on this inflexion tangent coincide at the point Ps 
of this tetrahedron. 

To obtain the dual theorem, an arbitrary point $P$ in $\pi$ but not on the
x-axis is chosen as the vertex of cones containing the curves $\Gamma$ and $T_\alpha$. For
simplicity we cut these cones by the plane $y_3 = 0$, obtaining curves $\bar\Gamma$ and $\bar T_\alpha$.
Then,




THEOREM 4.2. $\bar\Gamma$ and $\bar T_\alpha$ have exactly six-point contact at $O$ for all
centers of projection $P$ unless $\alpha = -q_6/2$. If $\alpha = -q_6/2$, they have just
seven-point contact unless $P$ lies on the line whose equations are


In this latter event the curves $\bar\Gamma$ and $\bar T_\alpha$ have precisely eight-point contact
at $O$ except when $P$ has coordinates


(22. 2) — 497°, — 2492, — 3qo°, 9), 
when they have nine-point contact. 
For, let $\Gamma$ and $T_\alpha$ be projected upon the plane $y_3 = 0$ from the point $P$
$(1, m, n, 0)$ $(n \neq 0)$.


It can be verified without difficulty that in the plane $y_3 = 0$ the projections
of $\Gamma$ and of $T_\alpha$ have the respective equations


+ 3mat*/n + (9m? — + (28m* — 13mn + 
+ (90m* — 64m?n + 5n? + + 
+ (297m> — 285m*n + 5lmn? + 27qem?n* 
— dqen* + Taz,mn* + gsn’)a?/n? +° °°, 
z= + 3mat/n + (9m? — 2n)x*°/n? + (28m* — 138mn — 2an*)a*/n® 
+ (90m* — 64m?n + 5n? — 15amn*)a2*/n* 
+ (29%m> — 285m*n + 51mn? — 8lam?n® + 


The results follow immediately from these equations. 
The Halphen-Bompiani theorem referred to above states: 


THEOREM 5.1. The locus of points projecting $\Gamma$ and the six-point cubic
$T_0$ into cones having at least seven-plane contact is the principal plane of
$\Gamma$ at $O$:
(23.1) 


If the center of projection lies on the line with equations 


(24. 1) + Prys = 0 = Ys, 


these cones have at least eight-plane contact along the principal plane, while 
for a unique point W on this line nine-plane contact is obtained. 


Theorem 3.2 shows that all points on the x-axis, except $O$ possibly, must
be excluded from the locus. Further examination indicates that the cones
projecting $\Gamma$ and $T_0$ from $O$ have but five-plane contact along $\pi$, so that $O$
must also be excluded.

The dual of this theorem is 




THEOREM 5.2. All planes¹⁴ passing through the principal point of
$\Gamma$ at $O$:
(23.2) $(7q_7, q_6, 0, 0)$,
intersect the tangent developables of $\Gamma$ and of the six-plane cubic $T_\alpha$ ($\alpha = 2q_6$)
in curves having at least seven-point contact. No other planes possess this
property. If the plane of section contains the line whose equations are


(24.2) — + — + 4297") = 0 = Ya, 
the curves have at least eight-point contact at the principal point, while for
a unique plane $\Omega$ through this line nine-point contact is obtained.

The tetrahedra $H_1$ and $H_2$ are characterized geometrically by the following
dual theorems.

THEOREM 6.1. The Halphen point $H$ of $\Gamma$ at $O$ is the point of intersection,
besides $O$, of the six-point cubic $T_0$ and the principal plane of $\Gamma$.
Its coordinates are

(25.1) $(0, 0, 0, 1)$;

the equation of the osculating plane $\Phi'$ to $T_0$ at this point is
(26.1) $y_1 = 0$.

THEOREM 6.2. The Halphen plane $\Phi$ of $\Gamma$ at $O$ is that osculating plane,
besides $\pi$, of the six-plane cubic $T_\alpha$ ($\alpha = 2q_6$) which contains the principal
point (23.2) of $\Gamma$. Its equation is
(25.2) $27q_6^3y_1 - 189q_6^2q_7y_2 + 441q_6q_7^2y_3 + (54q_6^4 - 343q_7^3)y_4 = 0$;
the coordinates of the contact point $H'$ of this cubic and this plane are
(26.2) $(343q_7^3 - 54q_6^4, 147q_6q_7^2, 63q_6^2q_7, 27q_6^3)$.

Furthermore, the lines $HH'$ and $\Phi\Phi'$ are coplanar with the edge $ON$ of
Sannia's tetrahedron,¹⁵ and together with the self-dual cubic $T_{q_6}$ serve to
characterize this tetrahedron.
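
Theorem 6.2 can be checked numerically. The sketch below rests on assumptions: it takes the cubics in the form $y_1 = 1 - \alpha\tau^3$, $y_2 = \tau$, $y_3 = \tau^2$, $y_4 = \tau^3$, and it uses the reconstructed readings $(7q_7, q_6, 0, 0)$ for the principal point and the coefficients shown in (25.2) and (26.2). With that parametrization, a plane osculating $T_\alpha$ at the parameter value $\tau$ must meet the cubic in $(s - \tau)^3$, which gives the plane coordinates used in `osculating_plane`. Function names and the sample values of $q_6$, $q_7$ are hypothetical.

```python
from fractions import Fraction as F

def cubic_point(alpha, tau):
    """Point of T_alpha with parameter tau (assumed parametrization)."""
    return (1 - alpha * tau**3, tau, tau**2, tau**3)

def osculating_plane(alpha, tau):
    """Plane whose intersection polynomial with T_alpha is (s - tau)^3."""
    return (-tau**3, 3 * tau**2, -3 * tau, 1 - alpha * tau**3)

def halphen_plane(q6, q7):
    """Coefficients of (25.2) as reconstructed above."""
    return (27 * q6**3, -189 * q6**2 * q7, 441 * q6 * q7**2,
            54 * q6**4 - 343 * q7**3)

def incident(plane, point):
    return sum(a * y for a, y in zip(plane, point)) == 0

def proportional(p, q):
    return all(p[i] * q[j] == p[j] * q[i] for i in range(4) for j in range(4))

q6, q7 = F(2), F(3)        # arbitrary sample values
tau = 3 * q6 / (7 * q7)    # parameter of the contact point H'
plane = halphen_plane(q6, q7)
contact = cubic_point(2 * q6, tau)
quoted_contact = (343 * q7**3 - 54 * q6**4, 147 * q6 * q7**2,
                  63 * q6**2 * q7, 27 * q6**3)
principal_point = (7 * q7, q6, 0, 0)
```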

The principal plane of $\Gamma$ and of a five-point cubic $T_\alpha$ ($\alpha \neq 0$) is their
common osculating plane $\pi$, while dually the principal point of $\Gamma$ and of a
five-plane cubic $T_\alpha$ ($\alpha \neq 2q_6$) is their common point $O$. Bompiani's extension
of Halphen's theorem was obtained only when the principal plane of the two
curves is distinct from the common osculating plane, while Theorems 3.2 and
4.2 cover the case where the two planes coincide. The complete results can
thus be summarized as follows:


14 With the exception of the planes through the x-axis which must be excluded.
15 See footnote 7.




The twisted cubic $T_0$ determines an axis of Bompiani (24.1) and a
point¹⁶ of Bompiani $W$ in the principal plane (23.1) of $\Gamma$; the cubic
$T_\alpha$ ($\alpha = -q_6/2$) determines an axis of Bompiani (21.2) and a point of
Bompiani (22.2) in the osculating plane $\pi$, and the cubic $T_\alpha$ ($\alpha = -2q_6$)
determines a point of Bompiani (15.2) on the tangent to $\Gamma$ at $O$. Dually,
the cubic $T_\alpha$ ($\alpha = 2q_6$) determines a ray of Bompiani (24.2) and a plane
of Bompiani $\Omega$ through the principal point (23.2) of $\Gamma$; the cubic for which
$\alpha = 5q_6/2$ determines a ray of Bompiani (21.1) and a plane of Bompiani
(22.1) through the point $O$, and the cubic with $\alpha = 4q_6$ determines a plane
of Bompiani (15.1) through the tangent to $\Gamma$ at $O$.


When a five-point cubic $T_\alpha$ is projected upon the plane $y_3 = 0$ from a
point $P$ in the plane $\pi$, the projected curve is a cusped or a nodal cubic
according as $P$ is or is not on the conic $K_2$. In either case an inflexion point
is obtained at $O$. Now, it follows from Bompiani's work¹⁷ that a plane curve
with an inflexion point sustains at this point a seven-point cusped cubic and
$\infty^1$ eight-point cubics (two of which are nodal), but that ordinarily it possesses
no eight-point cusped cubic and no nine-point osculating cubic. If the curve
does happen to have a nine-point cubic, then every eight-point cubic is a
nine-point cubic and there exists a unique ten-point cubic.

Theorem 4.2 therefore states that the point where the line (21.2) meets
the conic $K_2$, namely
(27) 2647; 0), 


projects $\Gamma$ into a curve in the plane $y_3 = 0$ which sustains an eight-point
cusped cubic, and that the point (22.2) projects $\Gamma$ into a curve sustaining
a ten-point cubic. The points (22.2) and (27) are not the only points in
the plane $\pi$, however, with these properties. It can be shown, for example,
that the locus of all points in $\pi$ projecting $\Gamma$ into a curve in the plane $y_3 = 0$
which sustains a ten-point cubic is the straight line joining the points (15.2)
and (22.2).

Theorems 3.1, 4.1, and 5.2 are concerned with plane sections of the
tangent developables of $\Gamma$ and of $T_\alpha$ and show that for properly chosen cubics
$T_\alpha$ there are certain planes which yield curves of section having contact of
higher orders than are obtained ordinarily. It is natural, then, to inquire
into the nature of the curve of intersection of the tangent developables of $\Gamma$
and of $T_\alpha$. The results are contained in the following theorem:


16 We prefer this terminology to “ principal point ” since we wish to use the latter 
for the dual of the principal plane. 
17 See footnote 12. 




THEOREM 7.1. The tangent developables of $\Gamma$ and of $T_\alpha$ intersect in
the x-axis and in a residual curve $C$. If $\alpha = q_6$, then $C$ consists of a single
branch having four-point contact with $\Gamma$ at $O$. If $\alpha \neq q_6$, then $C$ has four
branches three of which are linear, each of the three passing through $O$ and
having four-point contact with $\Gamma$. If $\alpha = 2q_6$, the fourth branch of $C$ does
not pass through $O$ but through the principal point (23.2). This branch is
linear and has for tangent and for osculating plane at the principal point the
ray of Bompiani (24.2) and the plane of Bompiani $\Omega$. If $\alpha = 5q_6/2$, the
fourth branch of $C$ has a cusp at $O$ with the ray of Bompiani (21.1) as
cusp-tangent and the plane of Bompiani (22.1) as osculating plane. If
$\alpha = 4q_6$, the fourth branch of $C$ has an inflexion point at $O$ with the x-axis
as tangent and the plane of Bompiani (15.1) as osculating plane. For all
other values of $\alpha$ the fourth branch of $C$ is linear, passes through $O$, has
two-point contact with $\Gamma$, and has $\pi$ as osculating plane.


To prove this theorem we equate the space coordinates (16) of a point
on the tangent developable of $\Gamma$ to the space coordinates (19) of a point on the
tangent developable of $T_\alpha$ after first replacing the parameters $u$, $v$ in the
latter set of equations by $\mu$, $\nu$. Elimination of $v$ and $\nu$ from the three equations
obtained yields the equation

mux, 
where, in particular, 
— 2(%— qo) ] = 9. 
Three cases thus arise: 


(a) Me? = 2(a— qo) 0; 
(b) Mo 0, = Ys 5 
(c) m, = 0, Ye. 


In case (b), 
Mes 

while in (c), 

m; = (), Ms = a(2a + qe) /2(a— qo), = «(4% — Go) q7/2(% Ge)’, 

Me = — Ge) qr” + (%— Go) { — — 8") pr 

+ (5a? — 2aqe) qu} ]/2(a— 

We can now express $v$ in terms of $u$.
(a) v= 


(c) three subcases must be considered: 




aA v= (a— Ge) U/3(2q6— @) 
+ (qo — 4a) + our 
where 
o = [3 — 2q6) (5a — 242) 4x — 9(a— 2qG6) (4% — qo) pr 
(c:) 
(Cs) V = Go/Tq7 + (2146p; — — u/49q7? 


Substituting into (16) yields the equations 


(a) y=wimu+t---, 
(b) y = uw? — 
t= — 2a) u/3(2qu— + (ge — 4a) + ou? +: 
(c:) y¥=(4q¢e—a) (2qe—a) + 2 (Ge—4a) + 2out +: 
z= qe? /(2qg—%) + (Ga — 4%) + 


(cz) y = 2u?/3 + 
= + — 8464s + 42977) u/49q7? +° °°, 
+ (42qep2 — 16qeqs + 35977) u?/49q77 
z= + (63q6p7 — + +° °°. 


The theorem follows readily from these equations. 
The dual theorem is 


THEOREM 7.2. The planes containing both a tangent to $\Gamma$ and a tangent
to $T_\alpha$ form an axial pencil through the x-axis and a residual family $C'$ of
planes, whose edge of regression is a curve $C''$. If $\alpha = q_6$, then $C''$ consists
of a single branch having four-plane contact with $\Gamma$ at $O$. If $\alpha \neq q_6$, then
$C''$ has four branches three of which are linear, each of the three passing
through $O$ and having four-plane contact with $\Gamma$. If $\alpha = 0$, the fourth branch
of $C''$ passes through the point of Bompiani $W$ of Theorem 5.1, and has for
tangent and for osculating plane at $W$ the axis of Bompiani (24.1) and the
principal plane (23.1). If $\alpha = -q_6/2$, the fourth branch of $C''$ passes
through the point of Bompiani (22.2) where it has the axis of Bompiani
(21.2) as tangent and the plane $\pi$ as singular osculating plane. If $\alpha = -2q_6$,
the fourth branch of $C''$ passes through the point of Bompiani (15.2) where
it has the x-axis as tangent and the plane $\pi$ as osculating plane. For all other
values of $\alpha$ the fourth branch of $C''$ is linear, passes through $O$ and has
two-plane contact with $\Gamma$.


There are certain interesting relations between these two dual theorems, 
but we shall not consider them here. 


6. Osculating quadrics at a point of a curve. The self-dual quadric
(10) was found to have both three-point and three-plane contact with $\Gamma$ at $O$.
Although there are o° quadrics with this property, there is no non-singular 
quadric having both four-point and four-plane contact with $\Gamma$ at $O$.

The general six-point quadric of $\Gamma$ at $O$ has the equation


(28) Yo")  M2(YiYs — + Ms — Yoys) + mays? = 0, 
where the $m_i$ are arbitrary. If this quadric contains the five-point cubic $T_\alpha$,
then


am, = 0 = am, — Ms, 
while if the quadric is to have seven-point contact with $\Gamma$ at $O$, then
+ Ms = 0. 
Hence we have the following theorem characterizing the cubic $T_\alpha$ ($\alpha = -q_6$).


THEOREM 8.1. There exists a one-parameter family of quadrics having
at least six-point contact with $\Gamma$ at $O$ and containing the five-point cubic
$T_\alpha$ ($\alpha \neq 0$). Neglecting the quadric cone $K'_2$, seven-point contact is obtained
if, and only if, $\alpha = -q_6$.


The dual theorem, which will be omitted, characterizes the cubic 
T, (a= 3q,). Another characterization of Tg, is due to The co? 
quadrics having seven-point contact with $\Gamma$ at $O$ have in common just two
points, $O$ and the residual intersection

$(q_6, 0, 0, 1)$

of the line $OH$ (25.1) with the five-point cubic $T_\alpha$ ($\alpha = -q_6$). Dually, the $\infty^2$ quadrics
having seven-plane contact with $\Gamma$ at $O$ have in common just two tangent
planes, $\pi$ and the plane

$27q_6^3y_1 - 189q_6^2q_7y_2 + 441q_6q_7^2y_3 + (81q_6^4 - 343q_7^3)y_4 = 0$,

which is coaxial with $\pi$ and $\Phi$ (25.2) and osculates the five-plane cubic
$T_\alpha$ ($\alpha = 3q_6$).


18 Loc. cit.




The fundamental tetrahedra are related to the seven-point quadrics
according to the following theorem.


THEOREM 9.1. Each point $P_2$ ($\neq O$) on the osculating conic $K_2$ determines
a unique seven-point quadric whose intersection with $\pi$ is a conic
tangent to $K_2$ at $O$ and at $P_2$. The tangent plane to the quadric at $P_2$
osculates the six-point cubic $T_0$, the tangent plane to the quadric at $O$ is the
face $OP_1P_3$, common to all fundamental tetrahedra which have $P_2$ for a vertex,
and the polar with respect to the quadric of the vertex $P_1$ of these tetrahedra
is the face $OP_2P_3$.


If, in particular, the point $P_2$ is chosen at $(0, 0, 1, 0)$, then the quadric
reduces to the cone
(29) $y = x^2$
with vertex at the Halphen point.

The dual theorem states:


THEOREM 9.2. Each plane $OP_2P_3$ ($\neq \pi$) tangent to the osculating
quadric cone $K'_2$ determines a unique seven-plane quadric whose cone of
tangents through $O$ touches $K'_2$ along $\pi$ and along $OP_2P_3$. The contact point
of the quadric with the plane $OP_2P_3$ lies on the six-plane cubic $T_\alpha$ ($\alpha = 2q_6$),
the contact point of the quadric with $\pi$ is the vertex $P_1$ common to all fundamental
tetrahedra determined by the plane $OP_2P_3$, and the pole with respect
to the quadric of the face $OP_1P_3$ of these tetrahedra is the vertex $P_2$.


7. Consecutive configurations. A manifold $M$ geometrically defined
for each value of the parameter $u$ of the curve $\Gamma$ generates or envelopes
another manifold $M'$ as $u$ varies. Several five-point twisted cubics can be
characterized in this way.

As wu varies, the five-point cubic 7’. (2 = const.) generates a surface Sa 
and the osculating planes of the dual cubic Tq (a+ @ = 2q¢) envelope the 
dual surface Sq. The tangent planes to Sq along 7, form a developable Da, 
while dually the osculating planes of Ta are tangent to Sq along a curve Da. 


Then, 


THEOREM 10.1. $D_\alpha$ is the tangent developable of a twisted cubic except
in the following four cases. If $\alpha = -4q_6$, $D_\alpha$ is a cubic cone with vertex at
the point whose coordinates are
(30.1) $(14q_7, 5q_6, 0, 0)$.
If $\alpha = -2q_6/3$, $D_\alpha$ is a cubic cone with vertex at a point,

(31.1) $(49q_7^2, 70q_6q_7, 75q_6^2, 0)$,

lying on the osculating conic $K_2$. If $\alpha = 0$, $D_\alpha$ is the quadric cone (29).
If $\alpha = 6q_6$, $D_\alpha$ is the osculating quadric cone $K'_2$. That is, the characteristic
curve at $O$ determined by the set of all osculating cones $K'_2$ is the cubic
$T_\alpha$ ($\alpha = 6q_6$) (and the x-axis); stated differently, the generators of $K'_2$ form
a congruence as the parameter $u$ varies, and the surface $S_\alpha$ ($\alpha = 6q_6$) is one
focal surface.²⁰ The cubics $T_\alpha$ on $S_\alpha$ have an envelope other than $\Gamma$ if, and
only if, $\alpha = 6q_6$, the contact point on the envelope associated with $O$ being at

$(343q_7^3 - 750q_6^4, 245q_6q_7^2, 175q_6^2q_7, 125q_6^3)$.


The proof runs as follows. A point $P$ on $\Gamma$ near $O$ determines a canonical
tetrahedron $H_1(P)$. If local homogeneous coordinates referred to this tetrahedron
are denoted by $Y_i$, then the equations of the cubic $T_\alpha$ associated with
$P$ are

(32) $Y_1 = 1 - \alpha\tau^3$, $Y_2 = \tau$, $Y_3 = \tau^2$, $Y_4 = \tau^3$.

Now let $P$ have non-homogeneous coordinates $(h, k, l)$ referred to tetrahedron
$H_1$ at $O$. From (9) and the equation
Yi=yi t+ 

we have 

pV 1 = — (21 + °°, 

= — (2qoy2— 14q7y2/3 + 

PY = — — )h +-° 


Solving for y; and using (32) we obtain 


= Yo(1 — ar*) + (21 + 

AY2 = + (Ya— + 14 pyr? + — ager*)h+- 
= Yor? + — 14G777/3 + 

As = + (8467? — 


as parametric equations (33) of the surface $S_\alpha$. When $h = 0$, $\tau = t$, the tangent
plane to $S_\alpha$ has the form


— a) — + — ly, + (6967 + — 14aqzt) ly 
— — — ys = 0. 


19 Newton, loc. cit. Also Tsuboko, “On the locus of the space cubics osculating a
space curve,” Memoirs of the Ryojun College of Engineering, vol. 10 (1937), pp. 63-74.

20 Wilczynski, “General projective theory of space curves,” Transactions of the
American Mathematical Society, vol. 6 (1905), p. 109. This result has also been
obtained independently by Kanitani and Newton, loc. cit. This cubic has been called
the torsal cubic of $\Gamma$ at $O$ by Wilczynski.




As $t$ varies, this plane envelopes the developable surface $D_\alpha$ whose edge of
regression has the parametric equations


Yi = — 45965 (246 + 3%) + + + grt — 147027 
+ [686a°q;* + 9aqu*(2G6 + 3a) (4q6 + %) (646 — a) 
Yo = — 45496" (2qu + 3%) — a) t + 2102796? — a) 
— 98a7q6(6q5 — %)q77l*, 
Ys = — 454q6° (446 + %) (6G6 — %) + 2laqe? + «) (6¢6 — 
ys = — (2qo + 3a) + 2) (64s — a) 


These equations yield all but the last statement of the theorem. To complete
the proof we set

in (33) and find


(1 — — — + ar*®)h+---, 

t= r+ (1 — — — — 2ar* + 
— — Gager® a°r®)h/(1 + 2ar?) 


The desired result then follows immediately. 
The dual theorem states: 


THEOREM 10.2. The curve $D_\alpha$ is a twisted cubic except in the following
four cases. If $\alpha = 6q_6$, $D_\alpha$ is a curve of class three lying in the plane dual to
the point (30.1). If $\alpha = 8q_6/3$, $D_\alpha$ is a curve of class three lying in the
plane dual to the point (31.1). If $\alpha = 2q_6$, $D_\alpha$ is a conic lying in the
Halphen plane (25.2). If $\alpha = -4q_6$, $D_\alpha$ is the osculating conic $K_2$; that
is, the surface $S_\alpha$ is the locus of all osculating conics $K_2$;²¹ stated differently,
the tangents to $K_2$ form a congruence as the parameter $u$ varies, and the
surface $S_\alpha$ ($\alpha = -4q_6$) is one focal surface.


8. Projections of a space curve. When the space curve $\Gamma$ and one of
its five-point twisted cubics $T_\alpha$ are projected from an arbitrary point $P(h, k, l)$,
$l \neq 0$, upon the osculating plane $\pi$ at $O$, the projected curves $\Gamma'$, $T'_\alpha$ possess
certain interesting relations.

The equations of $\Gamma'$ are readily found to be

(34) $y = x^2 - kx^3/l + 2hx^4/l - (3hk + l)x^5/l^2 + (7h^2 + 2k + \cdots)x^6/l^2 + \cdots$,

while those of $T'_\alpha$ are


21 Tsuboko, “ On the locus of the conics osculating a space curve,” Memoirs of the 
Ryojun College of Engineering, vol. 10 (1937), pp. 11-17. 




(35) $y = x^2 - kx^3/l + 2hx^4/l - (3hk + l + \alpha l^2)x^5/l^2 + (7h^2 + 2k + \cdots)x^6/l^2 + \cdots$.

These curves always have at least five-point contact, and hence have at $O$ a
common osculating conic $K$. Higher-order contact is obtained only when
$\alpha = 0$, as seen from (34) and (35) or from the theorem that the principal
plane of $\Gamma$ and $T_\alpha$ ($\alpha \neq 0$) is the osculating plane $\pi$.
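
The terms of (34) and (35) through $x^5$ can be confirmed by a short series computation. This is a sketch under assumptions: the curve is taken as $y = u^2 - \alpha u^5$, $z = u^3$ with $x = u$, which agrees with $T_\alpha$ through the order needed and with $\Gamma$ when $\alpha = 0$, and the central projection from $P(h, k, l)$ onto $z = 0$ is written as $X = (lx - hz)/(l - z)$, $Y = (ly - kz)/(l - z)$. The helper names are hypothetical.

```python
from fractions import Fraction as F

N = 6  # series in u are truncated modulo u^6

def mul(a, b):
    out = [F(0)] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < N:
                out[i + j] += ai * bj
    return out

def projected_residual(h, k, l, alpha=0):
    """Coefficients of Y(u) - sum c_n X(u)^n for the claimed c_2..c_5."""
    h, k, l, alpha = F(h), F(k), F(l), F(alpha)
    zero = F(0)
    inv = [1 / l, zero, zero, 1 / l**2, zero, zero]      # 1/(l - u^3) mod u^6
    X = mul([zero, l, zero, -h, zero, zero], inv)         # (l*x - h*z)/(l - z)
    Y = mul([zero, zero, l, -k, zero, -alpha * l], inv)   # (l*y - k*z)/(l - z)
    claimed = {2: F(1), 3: -k / l, 4: 2 * h / l,
               5: -(3 * h * k + l + alpha * l**2) / l**2}
    rhs = [zero] * N
    Xn = [F(1)] + [zero] * (N - 1)
    for n in range(1, N):
        Xn = mul(Xn, X)                                   # Xn now holds X(u)^n
        if n in claimed:
            rhs = [r + claimed[n] * c for r, c in zip(rhs, Xn)]
    return [a - b for a, b in zip(Y, rhs)]
```

The residual vanishes identically through $u^5$, and the only place where $\alpha$ enters is the $x^5$ coefficient, which is why the two projected curves have exactly five-point contact unless $\alpha = 0$.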

The equations of $K$ are

(36) $y = x^2 - kxy/l + (2hl - k^2)y^2/l^2$.

This conic has six-point contact with $\Gamma'$ if, and only if,

(37) $\mu_1 \equiv l^2 + 2k^3 - 3hkl = 0$,
and has six-point contact with $T'_\alpha$ if, and only if,
(38) $\mu_2 \equiv \mu_1 + \alpha l^3 = 0$,

i.e. if, and only if, the center of projection $P$ lies on the cubic surface (14).
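
Conditions (37) and (38) follow from comparing $x^5$ coefficients. The sketch below assumes the reconstructed forms of (35) and (36) and uses hypothetical function names: it expands the conic as $y = x^2 + a_3x^3 + a_4x^4 + a_5x^5 + \cdots$ and checks that the difference between its $x^5$ coefficient and that of the projected curve equals $\mu_2/l^3$.

```python
from fractions import Fraction as F

def mu(h, k, l, alpha=0):
    """mu_2 of (38); with alpha = 0 this is mu_1 of (37)."""
    return l**2 + 2 * k**3 - 3 * h * k * l + alpha * l**3

def fifth_order_gap(h, k, l, alpha=0):
    """x^5 coefficient of the conic (36) minus that of the curve (35)."""
    h, k, l, alpha = F(h), F(k), F(l), F(alpha)
    c = (2 * h * l - k**2) / l**2        # coefficient of y^2 in (36)
    a3 = -k / l                          # x^3 coefficient of the conic
    a4 = -(k / l) * a3 + c               # x^4 coefficient, equal to 2h/l
    conic_a5 = -(k / l) * a4 + 2 * c * a3
    curve_a5 = -(3 * h * k + l + alpha * l**2) / l**2
    return conic_a5 - curve_a5

def h_on_surface(k, l, alpha):
    """Solve mu_2 = 0 for h, giving a center of projection on surface (14)."""
    k, l, alpha = F(k), F(l), F(alpha)
    return (l**2 + 2 * k**3 + alpha * l**3) / (3 * k * l)
```

For instance, a point $(\tau, \tau^2, \tau^3)$ of the six-point cubic gives $\mu_1 = 0$, in agreement with the later remark that the assumption $\mu_1 \neq 0$ keeps the center of projection off that cubic.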

If $P$ traces a line $L$ through $O$, the conic $K$ remains unchanged. Moreover,
we can readily prove

THEOREM 11. The conics $K$ and $K_2$ have double contact if, and only if,
the center of projection $P$ lies on the quadric cone $K'_2$. If this is the case,
the line $OP$ and the chord of double contact of the conics are edges of a
common fundamental tetrahedron.

We shall be concerned in this section with the projective normal and the
flex-ray of $\Gamma'$ at $O$,²² and in order that these be well-defined we must have a
non-composite osculating nodal cubic of $\Gamma'$ at $O$, which means that we must
assume that $\mu_1 \neq 0$. This assumption furthermore prevents the center of
projection from lying on the six-point cubic $T_0$.

The osculating nodal cubic of $\Gamma'$ at $O$ can be shown to have the equations

+ (pi? + k?v, — 2hly,)y* = 0 =z, 
where $\mu_1$ is given by (37) and
vy, = 8hk?l — h?l? — 2hl? — dk* — qokl*. 
Hence the projective normal of $\Gamma'$ at $O$ is expressed by
(39) 
while the flex-ray is seen to be 


22 The flex-ray of $\Gamma'$ at $O$ is defined as the line of inflexions of the osculating nodal
cubic of $\Gamma'$ at $O$.



(40) Py? + + (hk? — + v1?) y = 0 =z, 
In the same way the projective normal at $O$ of $T'_\alpha$ is of the form

(41) $l\mu_2x + \nu_2y = 0 = z$,

where $\mu_2$ is given by (38), and


= 8hk?l — h?l? — 2hl? — — 2akb, 


while the flex-ray of $T'_\alpha$ at $O$ has the equations


(42) + + 2v2) (ke? — + v2") y == 0 =z. 


THEOREM 12. As the center of projection $P$ traces a line $L$ through $O$,
the projective normal of $\Gamma'$ varies in the pencil at $O$ unless $L$ is a generator
of the quartic cone
(43) (y? — az)? + = 0. 
Neglecting the x-axis which must be excluded, this cone meets the quadric
cone $K'_2$ in the z-axis. As $P$ traces the z-axis, the projective normal of $\Gamma'$
at $O$ remains coincident with the y-axis.

As $P$ traces a line $L$ through $O$, the projective normal of $T'_\alpha$ varies unless
$L$ is a generator of $K'_2$. As $P$ traces a generator $OP_3$ of $K'_2$, the projective
normal of $T'_\alpha$ at $O$ coincides with the edge $OP_2$ of the fundamental tetrahedra
determined by

The proof offers no difficulties. In (39) and in (41) we replace $(h, k, l)$
by homogeneous coordinates $(h_1, \cdots, h_4)$ and demand that the resulting
equation be independent of $h_1$. In both cases we obtain


— 2hzy = 0 =z, 
so that 
2hspi + havi = 0 2). 

Replacing $\mu_i$ and $\nu_i$ by their values we have the results immediately.

The locus of all centers of projection $P$ determining curves $\Gamma'$ which
have at $O$ a fixed projective normal, say
(44) $x + my = 0 = z$,
is a fourth-order surface S*,, determined by the condition 
(45) $\mu_1 m - \nu_1 = 0$.
The only five-point twisted cubic $T_\alpha$ which lies on this surface is the one for
which $\alpha = q_6/2$, this situation occurring only when the given projective
normal of $\Gamma'$ at $O$ is the y-axis.




If the projective normal of $T'_\alpha$ is chosen as the same line (44), then $P$
generates another fourth-order surface S*q determined by the condition 
(46) $\mu_2 m - \nu_2 = 0$.
The surfaces S*, and S*q coincide if, and only if, both a= q6/2 and (44) 
is the y-axis. If «—4q,/2 but (44) is not the y-axis, these surfaces meet 
only in the plane z so that there is no center of projection yielding coincident 
projective normals at 0 for IY and 7%. If aA gqe/2 and (44) is the y-axis, 
the surfaces meet in the z-axis. In all other cases the surfaces intersect in a 
non-composite conic. 

From (45) and (46) we obtain 
(47) l= (qe — 2u)k/am, 
and upon substituting into (45) we find as the equations of the conic 

(6 — 2a) y — amz = 0, 
2%)*mz + (qe — 2a) *x? — am? (qo — 2a)? + 2a) az 
+ am[a?m® + + = 0. 


As m varies, % remaining fixed, these conics generate a surface whose equa- 


tion is 
(48) + ae? — + 2a) aye + (2q0+ 
(a0). 
T = 0 
This surface is composite if « = — 2q.6, and degenerates when « = 0 into the 


plane y= 0, as is evident from (47). 

The flex-ray of $\Gamma'$ at $O$ is tangent to the osculating conic $K$ of $\Gamma'$ at the
point of intersection of flex-ray and projective normal. A similar statement
holds for $T'_\alpha$. Hence, as $P$ traces a line $L$ through $O$ the envelope of the
flex-rays at $O$ of the curves $\Gamma'$ is the conic $K$ unless $L$ lies on the cone (43).


The following theorem can be readily proved. 


THEOREM 13. Let the center of projection $P$ trace a five-point cubic
$T_\alpha$ $(\alpha \neq 0)$. If $\alpha = 2q_6/3$, the flex-rays at $O$ of the curves $\Gamma'$ form a pencil
through the point $(0,1,0,0)$. If $\alpha = q_6/3$, or if $\alpha = q_6$, the flex-rays all pass
through the point $(0,0,1,0)$. In all other cases the flex-rays at $O$ of $\Gamma'$
envelope a non-degenerate conic. This conic has three-point contact with $K$
at $O$ if, and only if, $\alpha = 4q_6/3$.


If P traces a five-point cubic 7’g, the flex-ray at 0 of 7%. (a~B) 
envelopes the conic K» for all values of 2, p. 
Let the flex-ray at $O$ of $\Gamma'$ be a given line, say


(49) $rx + sy + 1 = 0 = z$.




Then from (40) we find that the center of projection P lies on a space curve 
Cy which is the intersection of a quadric cone 


(50) 8hl — 5k? + 2rkl — (r? — 4s)? =0 


and a quadric surface 
where 


wo = + 25rl + 42h? + 131rhk + (7? —4s)hl + (127? + 160s) k? 
— (267° — 104rs) kl. 


Hence Cp has a node at 0 with the x-axis as one tangent. If the residual 
tangent at 0 lies on the cone K’., then the flex-ray (49) is the edge P,P, 
of the fundamental tetrahedra determined by the residual tangent, and con- 
versely. Ordinarily Cp is a quartic curve, but under the condition 


(r? — 3s)? — 12qer = 0 


it consists of a generator of the cone (43) and a twisted cubic. 

If the flex-ray at 0 of 7’, is chosen as the same line (49), then the 
center of projection P lies on a curve C, which is the intersection of the cone 
(50) and a quadric surface 


wo + T5akl + 2arl? = 0. 


The curves Cp and Cg coincide if, and only if, both «= 2q6/3 and the flea- 
ray (49) passes through the point (0,1, 0,0). 


If «= 2q6/3 but the flex-ray does not pass through (0,1, 0,0), there 
are no centers of projection P yielding coincident flex-rays at 0 for IY and 7"o. 
If «A 2q-/3 and the flex-ray is the line y,; = ys = 0, then the locus of P is 
the z-axis. If «+4 2q,/3 and the flex-ray passes through (0,1, 0,0) but not 
through (0,0,1,0), then there are no points P. In all other cases there is a 
unique center of projection P which determines coincident flex-rays at 0 for 
IY and 7’,, the locus of these points P being the surface (48). 

Results of interest can also be obtained by studying other elements
associated with $\Gamma'$ and $T'_\alpha$, such as the focal point on the projective normal²⁸
or on the flex-ray, the Halphen point, the condition for a coincidence point
at $O$, etc.


INDIANA UNIVERSITY, 
BLOOMINGTON, INDIANA. 


28 Tsuboko, “Sur la courbure projective d’une courbe,” Memoirs of the Ryojun 
College of Engineering, Inouye Commemoration Volume (1934), pp. 59-74. 




THE UNLOADING PROBLEM FOR PLANE CURVES.* 


By PATRICK DU VAL.


This paper relates to an earlier one of mine in this Journal,¹ and uses
the same notation. In particular, Clarendon type indicates matrices, capitals
being used for rows or columns of geometrical entities and small letters for
numerical matrices. (By an oversight for which I apologise, E was used
instead of e in the former paper for the unit or identical matrix.) The
transpose of a matrix is indicated by ~; any inequality between matrices is
understood to apply to corresponding elements, i.e., $h \geq k$ means $h_{\alpha\beta} \geq k_{\alpha\beta}$
for all $\alpha$, $\beta$, and in particular $h \geq 0$ means $h_{\alpha\beta} \geq 0$ for all $\alpha$, $\beta$.

It is familiar that if $O = (O_1 \cdots O_\sigma)$ is a set of distinct points in a plane,
and $h = (h_1 \cdots h_\sigma)$ an arbitrary row of non-negative numbers, then curves
of sufficiently high order exist having multiplicity $h_\alpha$ in $O_\alpha$ $(\alpha = 1, \cdots, \sigma)$,
and no other multiple point, and not all passing through any other one point;
but that this is no longer true if some of the points $O$ are in the
neighbourhoods of others, unless certain inequalities, which we call the consistency
conditions, are satisfied by the assigned multiplicities $h$. In fact, since the
multiplicity of a curve in any point is at least the sum of its multiplicities in points
proximate to that,² the consistency conditions are


(1) $m\tilde{h} \geq 0$,


where $m$ is the matrix defined in the paper referred to, in which


$m_{\alpha\beta} = 1$ if $\alpha = \beta$,
$m_{\alpha\beta} = -1$ if $O_\beta$ is proximate to $O_\alpha$,
$m_{\alpha\beta} = 0$ otherwise;


we shall call it the proximity matrix of the points $O$.

* Received August 15, 1939.
¹ P. Du Val, American Journal of Mathematics, vol. 58 (1936), p. 285.
² F. Enriques and O. Chisini, Teoria geometrica delle equazioni, vol. 2, pp. 425-438.


It is also tolerably familiar that the conditions of having multiplicity
$h_\alpha$ in $O_\alpha$ and multiplicities $h_\beta \cdots h_\epsilon$ in $O_\beta \cdots O_\epsilon$ (proximate to $O_\alpha$) are
formally satisfied by curves whose actual multiplicities in these points are
$h_\alpha + 1, h_\beta - 1, \cdots, h_\epsilon - 1$ respectively; since these conditions reduce
essentially to having at least $h_\alpha$ coincident intersections with every simple branch
through $O_\alpha$, at least $h_\alpha + h_\beta$ with every simple branch touching $O_\alpha O_\beta$, etc.
This is the unloading (scaricamento) principle of Enriques³ in its simplest
form. The alteration in the multiplicities consists of adding to the row $h$ the
$\alpha$-th row of the matrix $m$; so that generalising the process we clearly have:


(2) The conditions of having multiplicities $h$ in points $O$ are formally
satisfied by curves whose actual multiplicities there are


$h + xm, \quad x \geq 0$.


This we may regard as the general statement of the unloading principle. 

The question arises, if we attempt to impose on a curve multiplicities 
not satisfying the conditions (1), what multiplicities will it in fact have? 
Enriques* asserts that this question can always be answered by the applica- 
tion of the unloading principle, and of another which he calls that of 
smoothing or evening (scorrimento). The latter is in fact the solution of the 
problem, as far as it concerns a set of points consecutive on a simple branch, 
and with only one of the inequalities (1) unsatisfied, namely that which 
relates to the last point but one of the sequence. The attack on the problem 
in general is not given explicitly, but is illustrated by a comparatively simple 
example, though it seems to have been generally regarded as clear that a solu- 
tion can always be arrived at by a finite number of unloadings and smoothings. 
What I shall now shew is that, given perfectly arbitrary proximity relations 
between the points, and perfectly arbitrary assigned multiplicities, h, we can 
always find a set of numbers k such that: 


(a) $k$ are consistent actual multiplicities; i.e., satisfy (1).

(b) Curves with multiplicities $k$ formally satisfy the conditions for having
the assigned multiplicities; i.e., $k = h + xm$, $x \geq 0$.

(c) All curves satisfying (a), (b), are formally contained in the system that
will be found.


Eliminating $k$ between the conditions (a), (b), it is clear that the
inequalities (1) reduce to
$m(\tilde{m}\tilde{x} + \tilde{h}) \geq 0$,
which we rewrite in the form


(i) $a\tilde{x} + c \geq 0$;


where $c = m\tilde{h}$, and $a = m\tilde{m} = -n$ is the negative of the intersection matrix
of the diminished neighbourhoods $L$ of the points $O$, as explained in my
former paper. What remains of the Condition (b) is of course just the
inequalities

(ii) $x \geq 0$,


while the Condition (c) clearly means that the solution $x$ of the inequalities
(i), (ii) which we seek is such that if $x'$ is any other solution,


$x' \geq x$.


³ F. Enriques and O. Chisini, Ibid., vol. 2, pp. 425-438.
⁴ F. Enriques and O. Chisini, Ibid., vol. 2, pp. 425-438.


The matrix $a$ is symmetrical and positive definite, and being the negative
of an intersection matrix of distinct irreducible curves, has no positive element
off the diagonal. It is of unit determinant, and thus $a^{-1}$ also consists of
integers; we prove first of all that $a^{-1} \geq 0$. An algebraic proof of this was
first given by Coxeter,⁵ some years ago. I subsequently noticed that the result
is equivalent (interpreting the matrix as the scalar product matrix of a set of
vectors in Euclidean space) to the theorem that a spherical simplex which has
no obtuse dihedral angle has no obtuse edges either; and of this it is not hard
to construct an elementary trigonometrical proof. The simpler argument
which I give here was suggested to me by Mahler.⁶

To say that $a^{-1} \geq 0$ is the same as to say that $a\tilde{z} \geq 0$ implies $z \geq 0$.
Suppose if possible that some of the $z$'s are negative, whereas $a\tilde{z} \geq 0$; and let
$z'$ be the row of just those of the $z$'s that are $< 0$, the rest being omitted, and
$a'$ the diagonal minor of $a$ obtained by omitting the rows and columns
corresponding to the columns omitted in $z'$. Then a fortiori $a'\tilde{z}' \geq 0$, since the
terms omitted are all of the form $a_{\alpha\beta}z_\beta$, where $z_\alpha < 0$, $z_\beta \geq 0$, so that $\alpha \neq \beta$
and $a_{\alpha\beta} \leq 0$. Consequently $z'a'\tilde{z}' \leq 0$, which is impossible, since $a'$, being a
diagonal minor of the positive definite matrix $a$, is itself positive definite.
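The property just proved can be checked on a small case. The following is only an illustrative sketch, not part of the original argument: it assumes a hypothetical configuration of three points in which $O_2$ is proximate to $O_1$ and $O_3$ to $O_2$, and the helper names `proximity_gram` and `inverse` are invented for this example. It forms $a = m\tilde{m}$ and inverts it exactly.

```python
from fractions import Fraction

def proximity_gram(m):
    """Return a = m * m^T for a square integer matrix m (list of rows)."""
    n = len(m)
    return [[sum(m[i][k] * m[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse(a):
    """Invert a non-singular square matrix exactly by Gauss-Jordan elimination."""
    n = len(a)
    # Augment a with the identity matrix, using exact rational arithmetic.
    aug = [[Fraction(a[i][j]) for j in range(n)] +
           [Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for col in range(n):
        # Find a row with a non-zero pivot and swap it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Clear the pivot column in every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Assumed example: O2 proximate to O1, O3 proximate to O2.
m = [[1, -1, 0],
     [0, 1, -1],
     [0, 0, 1]]
a = proximity_gram(m)
a_inv = inverse(a)
```

In this example the off-diagonal elements of `a` are non-positive, while every element of `a_inv` is a non-negative integer, in agreement with the statement above.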

We conclude that


(3) If $y$ is chosen to satisfy the inequalities


$y \geq 0, \quad \tilde{y} + c \geq 0$,
then $x$ given by
$\tilde{x} = a^{-1}\tilde{y}$
is a solution of the inequalities (i), (ii).

(Clearly the second of these conditions is identical with (i), and the
first implies (ii)).


⁵ H. S. M. Coxeter, Annals of Mathematics, vol. 35 (1934), p. 601.
⁶ In conversation.

We next observe that if $x^*_\alpha$ is the least value of $x_\alpha$ in all solutions $x$ of
(i), (ii), then the row of numbers $x^*$ is itself a solution. (The proof of this
was suggested to me by Rado.⁷) For (ii) is clearly satisfied; and as regards
the $\alpha$-th of the inequalities (i), if $x'$ is a solution in which $x'_\alpha = x^*_\alpha$, we have


$\sum_\beta a_{\alpha\beta}x^*_\beta \geq \sum_\beta a_{\alpha\beta}x'_\beta$,


since the $\alpha$-th term is the same on both sides, and in every other term
$a_{\alpha\beta} \leq 0$, $x^*_\beta \leq x'_\beta$; and hence of course


$\sum_\beta a_{\alpha\beta}x^*_\beta + c_\alpha \geq \sum_\beta a_{\alpha\beta}x'_\beta + c_\alpha \geq 0$.

In other words,


(4) There exists a solution $x^*$ of the inequalities (i), (ii), such that if $x$ is
any other solution of them
$x \geq x^*$.


Putting $x^*$ for $x$ in (b), we obtain the value of $k$ satisfying (c).
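For small cases the minimal solution can also be reached by repeated unloading. The sketch below is not the trial procedure described in this paper; it is the plain iteration "unload at any point where the consistency condition fails", and the function name `unload` and the sample data are assumptions made for illustration. The iteration never overshoots a solution $x'$: if the current $x \leq x'$ and the $\alpha$-th inequality fails for $x$, then $x_\alpha < x'_\alpha$ by the same sign argument as above, so raising $x_\alpha$ by one keeps $x \leq x'$.

```python
def unload(m, h):
    """Repeatedly unload until the multiplicities are consistent.

    m is the proximity matrix (list of rows), h the assigned multiplicities.
    Returns (k, x) with k = h + x*m and m*k^T >= 0.
    """
    n = len(h)
    k = list(h)
    x = [0] * n
    while True:
        # excess[i] = k_i minus the sum of k_j over the points O_j proximate to O_i
        excess = [sum(m[i][j] * k[j] for j in range(n)) for i in range(n)]
        bad = [i for i in range(n) if excess[i] < 0]
        if not bad:
            return k, x
        i = bad[0]
        # One unloading at O_i: add the i-th row of m to k.
        k = [k[j] + m[i][j] for j in range(n)]
        x[i] += 1

# Assumed example 1: O2 proximate to O1, assigned multiplicities (1, 2).
k2, x2 = unload([[1, -1], [0, 1]], [1, 2])

# Assumed example 2: three consecutive points on a simple branch, h = (1, 1, 2).
k3, x3 = unload([[1, -1, 0], [0, 1, -1], [0, 0, 1]], [1, 1, 2])
```

In the first example a single unloading at $O_1$ suffices; in the second, an unloading at $O_2$ is followed by one at $O_1$.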

An explicit formula for $x^*$ in terms of $a$, $c$ is not easy to find. I am able
only to give a method of finding it which involves a finite process of trial.
For this it is convenient to drop the restriction on $x$ to be a row of integers,
and consider instead a row $z$ of real numbers. The foregoing argument is
practically unaltered, and we conclude that there is a minimum solution $z^*$
of the corresponding inequalities


(i'), (ii') $a\tilde{z} + c \geq 0, \quad z \geq 0$;

now Erdös⁸ has remarked that for every $\alpha$, either $z^*_\alpha = 0$ or
$\sum_\beta a_{\alpha\beta}z^*_\beta + c_\alpha = 0$; for if $z$ is a solution in which $z_\alpha > 0$,
$\sum_\beta a_{\alpha\beta}z_\beta + c_\alpha > 0$, then $z_\alpha$ can be diminished
without destroying either of these inequalities, the rest of (ii') will be
unaffected, and all the rest of (i') will be strengthened, since in each of them
the coefficient of $z_\alpha$ is $\leq 0$. Thus we see that


(5) If $z'$ is the row obtained from $z^*$ by omitting all elements that vanish,
and $a'$, $c'$ are obtained from $a$, $c$ by omitting the rows and columns
corresponding to these, then
$a'\tilde{z}' + c' = 0$.


⁷ In conversation.
⁸ In conversation.




Another form of this result is the following:
(6) $\tilde{z}^* = -bc$,


where $b$ is a matrix in which certain rows and the corresponding columns
consist entirely of zeros, and the diagonal minor obtained by omitting these
is the inverse of the corresponding minor of $a$.


I have not been able to find any direct criterion to determine which are
the vanishing $z^*$'s. It is easily seen that if $z_\alpha = 0$ in a solution of (i'), (ii'),
then $c_\alpha \geq 0$, $h_\alpha \geq 0$, and $g_\alpha \geq 0$ (where $\tilde{g} = \tilde{m}^{-1}\tilde{h} = a^{-1}c$); these conditions
however are only necessary and not sufficient for the vanishing of $z_\alpha$. In an
actual case we should first try putting $b_{\alpha\beta} = b_{\beta\alpha} = 0$ for all values of $\alpha$
satisfying these conditions, then for every combination of all but one of these
values, and so on, constructing each time the matrix $b$ and the row $z^*$ in
accordance with (6); the rows we obtain will all satisfy (ii'), and the first
one to arise satisfying (i') also, is in fact $z^*$. From this of course we obtain
$x^*$ (which is what we really want) by the obvious relation


(7) $x^*_\alpha$ is the least integer which is $\geq z^*_\alpha$.
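As an illustration of the trial procedure, and not an example taken from the paper, consider the assumed configuration of two points with $O_2$ proximate to $O_1$ and assigned multiplicities $h = (1, 2)$. The computation below follows (6) and (7) step by step.

```latex
\[
m = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}, \quad
a = m\tilde{m} = \begin{pmatrix} 2 & -1 \\ -1 & 1 \end{pmatrix}, \quad
c = m\tilde{h} = \begin{pmatrix} -1 \\ 2 \end{pmatrix}, \quad
\tilde{g} = a^{-1}c
  = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}
    \begin{pmatrix} -1 \\ 2 \end{pmatrix}
  = \begin{pmatrix} 1 \\ 3 \end{pmatrix}.
\]
% Only \alpha = 2 has c_\alpha \ge 0, h_\alpha \ge 0, g_\alpha \ge 0, so we first try z_2 = 0:
\[
b = \begin{pmatrix} \tfrac12 & 0 \\ 0 & 0 \end{pmatrix}, \qquad
\tilde{z} = -bc = \begin{pmatrix} \tfrac12 \\ 0 \end{pmatrix}, \qquad
a\tilde{z} + c = \begin{pmatrix} 0 \\ \tfrac32 \end{pmatrix} \ge 0 .
\]
% Both (i') and (ii') hold, so z^* = (1/2, 0); by (7), x^* = (1, 0) and k = h + x^* m = (2, 1).
```

The first trial already succeeds here, and the resulting $k = (2, 1)$ is what one unloading at $O_1$ produces.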


THE UNIVERSITY, 
MANCHESTER. 




A COMPLETENESS THEOREM.* 
By R. P. BOAS, JR.¹


1. Introduction. This note and the following one developed out of
the problem of proving that the set of functions


(1.1) $e^{2nix}, \quad x e^{2nix}$ $(n = 0, \pm 1, \pm 2, \cdots)$


is complete in $L^2(-\pi, \pi)$. This problem is equivalent to the problem of
showing that an entire function $F(z)$ of the form


(1.2) $F(z) = \int_{-\pi}^{\pi} e^{izt} f(t)\,dt$


is identically zero if $F(2n) = F'(2n) = 0$, $n = 0, \pm 1, \pm 2, \cdots$; this
theorem is easily proved.² If the original problem is generalized by replacing
the multiplier $x$ in (1.1) by a more general function $G(x)$, it is more
satisfactory to attack the problem directly; some uniqueness theorems for entire
functions can be obtained as corollaries. In the following note, on the other
hand, it is the uniqueness theorem which is generalized; the two kinds of
generalization lead in different directions, and are studied by different methods.

In this note, I shall establish the following completeness theorem, which
is quite easily proved once the correct formulation has been found.

THEOREM 1. Let $G(x) \in L^2(-\pi, \pi)$. The set of functions

(1.3) $e^{2nix}, \quad G(x)e^{(2n+1)ix}$ $(n = 0, \pm 1, \pm 2, \cdots)$

is complete in $L^2(-\pi, \pi)$ if and only if

(1.4) $G(x + \pi) + G(x) \neq 0$,


except perhaps on a set of measure zero.³


* Received November 10, 1939.

¹ Most of the results of this note were obtained while the author was a National
Research Fellow.

² It is contained in Theorem 1 of the following note (“Some uniqueness theorems
for entire functions,” American Journal of Mathematics, vol. 62 (1940), pp. 319-324).

³ Since completeness and closure in $L^2$ are equivalent properties, any element of $L^2$
can be approximated, in the metric of $L^2$, by a sequence of linear combinations of the
functions (1.3). If (1.4) is replaced by the stronger condition that $G(x)$ is
essentially bounded and $|G(x + \pi) + G(x)| \geq \delta > 0$ almost everywhere, the functions
(1.3) are easily shown to have the stronger property that any element of $L^2$ can be
expanded in a series of them, the series converging in the $L^2$ metric (converging in
the mean).




COROLLARY. The set
$e^{2nix}, \quad G(x)e^{2nix}$ $(n = 0, \pm 1, \pm 2, \cdots)$
is complete in $L^2(-\pi, \pi)$ if and only if
$G(x + \pi) - G(x) \neq 0, \quad -\pi < x < 0$,
except perhaps on a set of measure zero.


This follows from the theorem if $G(x)$ is replaced by $e^{-ix}G(x)$. The
corollary, with $G(x) = x$, corresponds to the original problem; the
formulation with the set (1.3) is more suitable for generalization.
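The substitution behind the corollary can be written out in one line. The symbol $G^*$ below is introduced only for this illustration.

```latex
% Put G^*(x) = e^{-ix} G(x). Then the second family in (1.3) becomes
\[
G^*(x)\,e^{(2n+1)ix} = G(x)\,e^{2nix},
\]
% and, since e^{-i(x+\pi)} = -e^{-ix}, condition (1.4) for G^* reads
\[
G^*(x+\pi) + G^*(x) = e^{-ix}\bigl[\,G(x) - G(x+\pi)\,\bigr] \neq 0 .
\]
% With G(x) = x on (-\pi, 0): G(x+\pi) - G(x) = \pi \neq 0.
```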

The general problem of determining necessary and sufficient conditions
for the completeness of

$\{e^{i m_k x}, \; G(x)e^{i n_k x}\}$,


where $\{m_k\}$ and $\{n_k\}$ are mutually exclusive sequences containing all the
integers between them, appears to be difficult; but the case in which $\{n_k\}$ is
an arithmetic progression is easy, and not essentially different from Theorem 1
(see Theorem 5). Another generalization, in which the set of Fourier
functions is broken into more than two sequences, is discussed in § 4.

By means of a theorem of R. E. A. C. Paley and N. Wiener, Theorems 1
and 5 can be transformed into uniqueness theorems for entire functions of
exponential type. Let $W_\pi$ be the class of entire functions of exponential
type⁴ $\pi$, belonging to $L^2$ on the real axis. I state only the theorem equivalent
to Theorem 1.


THEOREM 2. Let $g(z) \in W_\pi$; let $G(t)$ be the Fourier transform of $g(z)$.
A necessary and sufficient condition that every $f(z) \in W_\pi$, satisfying


(1.5) $f(2n) = \int_{-\infty}^{\infty} f(x)\,g(2n + 1 - x)\,dx = 0$,


is identically zero, is that
(1.6) $G(t + \pi) + G(t) \neq 0$,
except perhaps on a set of measure zero.


A slightly less general theorem, with a considerable formal difference
from Theorem 2, can be obtained by application of a theorem of S. Bochner.


⁴ The entire function $f(z)$ is of exponential type $c$ $(c > 0)$ if $|f(z)| < Ae^{c|z|}$.
⁵ “Linear” means “additive, homogeneous, and continuous.”


THEOREM 3. Let $A$ be a linear⁵ operator from $L^2(-\infty, \infty)$ to
$L^2(-\infty, \infty)$, permutable with differentiation.⁶ A necessary and sufficient
condition that every $f(z) \in W_\pi$ satisfying
(1.7) $f(2n) = A\{f(2n + 1)\} = 0$, $(n = 0, \pm 1, \pm 2, \cdots)$,


is identically zero, is that 


sin TU ity 7 


except perhaps on a set of measure zero.⁷
For comparison with Theorem 2 of the following note, I mention the 
following special case of Theorems 2 and 3. 


THEOREM 4. If $f(z) \in W_\pi$, $q$ is a positive integer, and


(1.9) $f(2n) = f^{(q)}(2n) = 0$, $(n = 0, \pm 1, \pm 2, \cdots)$,
then $f(z) \equiv 0$.

Here G(t) is equal to (tt)¢e** on (—~7,7), and vanishes outside 
(—~7, 7); A is defined on f(x) « Wx by the relation A{f(x)} (2—1). 


2. Proof of the completeness theorem. Let $G(x)$ be a function of
$L^2(-\pi, \pi)$, having the Fourier series


(2.1) $G(x) \sim \sum_{p=-\infty}^{\infty} \gamma_p e^{ipx}$.


If $B$ is a sequence of integers, I denote by $G_B(x)$ the function whose
Fourier series is the part of the Fourier series of $G(x)$ with exponents in $B$;
that is,

(2.2) $G_B(x) \sim \sum_{p \in B} \gamma_p e^{ipx}$.
I shall prove the following theorem, which includes Theorem 1 as a special
case (when $N$ is the set of odd integers).
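To see how the special case arises, take $N$ to be the odd integers, so that $B$ is the set of even integers. The short computation below is a sketch, with $G(x)$ continued periodically with period $2\pi$; it shows that the function $G_B$ is then half the sum appearing in (1.4).

```latex
\[
G(x+\pi) \sim \sum_{p=-\infty}^{\infty} \gamma_p e^{ip(x+\pi)}
          = \sum_{p=-\infty}^{\infty} (-1)^p \gamma_p e^{ipx},
\qquad
\tfrac12\bigl[G(x) + G(x+\pi)\bigr] \sim \sum_{p \ \mathrm{even}} \gamma_p e^{ipx} \sim G_B(x).
\]
% Hence G_B(x) \neq 0 almost everywhere if and only if G(x+\pi) + G(x) \neq 0 almost everywhere.
```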

THEOREM 5. Let $N$ be an arithmetic progression with elements $a + kb$,
$b \geq 0$ $(k = 0, \pm 1, \pm 2, \cdots)$; and let $B$ be the set of all integers $kb$
$(k = 0, \pm 1, \pm 2, \cdots)$. Let $G(x) \in L^2(-\pi, \pi)$. Then the set of functions

(2.3) $e^{inx}, \quad G(x)e^{in_1 x}$ $(n \notin N, \; n_1 \in N)$

is complete in $L^2(-\pi, \pi)$ if and only if

(2.4) $G_B(x) \neq 0$,


except perhaps on a set of measure zero.


⁶ That is, when $f$ and $g$ are elements of $L^2(-\infty, \infty)$ such that $g(x) = f'(x)$, we
have $A\{g(x)\} = \frac{d}{dx}A\{f(x)\}$.

⁷ We can state a theorem, similar to Theorem 3, but entirely equivalent to Theorem
2, by introducing the space $L^*$ whose elements are functions $f(x)$ which are Fourier
transforms of elements $F(t)$ of $L(-\infty, \infty)$, with the norm $\|f\| = \int_{-\infty}^{\infty} |F(t)|\,dt$.
Then Theorem 3 remains true if $A$ is a linear operator from $L^2(-\infty, \infty)$ to $L^*$,
permutable with differentiation.


To establish the sufficiency of (2.4), we have to show that if it is
satisfied, if $F(x) \in L^2(-\pi, \pi)$, and if


(2.5) f F(x)eimedx — 0, 
and 
(2. 6) f F(x) G(x) eim=da = 0, 


then $F(x) = 0$ almost everywhere.
Relation (2.5) shows that $F(x)$ has a Fourier series of the form
50 
(2.7) F(x) ~ 
k=-00 


Let $G(x)$ have the Fourier series (2.1), and let $G_B(x)$ be defined by (2.2).
Then, from (2.6), we have


se’B 


Since $n + s \notin N$ if $n \in N$ and $s \notin B$, the series on the right is zero, by (2.5).
Thus we have


f I(x) Gp(x) = 0, (k= 0, + 1, + 2,° 
so that $F(x)G_B(x)$ has a Fourier series of the form
(2. 8) ~ Dd Byer. 
k=-00 


But the Fourier series of $F(x)G_B(x)$ can be obtained by formal multiplication
of the Fourier series of $F(x)$ and $G_B(x)$; consequently, by (2.7),


(2.9) ~ &e im, 
k=-00 


since —1b N for any integer J. 

Now (2.8) and (2.9) are in contradiction unless $F(x)G_B(x) = 0$
almost everywhere; since, by (2.4), $G_B(x)$ is almost nowhere zero, $F(x)$ is
almost everywhere zero. This completes the proof of the sufficiency of (2.4).
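The chain of identities in the sufficiency argument can be summarized as follows. This is only a sketch under an assumed sign convention for the orthogonality relations (2.5) and (2.6), which are not fully legible here; with the opposite convention every exponent changes sign and the reasoning is the same. The term-by-term integration is justified by Parseval's relation, since $F(x)e^{inx}$ and $G(x) - G_B(x)$ both belong to $L^2$.

```latex
% Assumed form of the hypotheses (the sign convention is an assumption):
%   (2.5)  \int_{-\pi}^{\pi} F(x) e^{inx}\,dx = 0        (n \notin N),
%   (2.6)  \int_{-\pi}^{\pi} F(x) G(x) e^{inx}\,dx = 0   (n \in N).
\begin{align*}
\int_{-\pi}^{\pi} F(x)G_B(x)e^{inx}\,dx
  &= \int_{-\pi}^{\pi} F(x)G(x)e^{inx}\,dx
     - \sum_{s \notin B} \gamma_s \int_{-\pi}^{\pi} F(x)e^{i(n+s)x}\,dx = 0
     && (n \in N), \\
\int_{-\pi}^{\pi} F(x)G_B(x)e^{imx}\,dx
  &= \sum_{s \in B} \gamma_s \int_{-\pi}^{\pi} F(x)e^{i(m+s)x}\,dx = 0
     && (m \notin N).
\end{align*}
% First line: n + s \notin N when n \in N and s \notin B, so each term vanishes by (2.5).
% Second line: m + s \notin N when m \notin N and s \in B, again by (2.5).
% Hence every Fourier coefficient of F G_B vanishes, so F G_B = 0 almost everywhere.
```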




To establish the necessity of (2.4), we suppose that it is not satisfied.
We may suppose that $G(x)$ is periodic with period $2\pi$, and not almost
everywhere zero, since the set (2.3) is certainly not complete if $G(x) = 0$ almost
everywhere. Let $H$ (of positive measure) be the set of zeros of $G_B(x)$ in
$(-\pi, \pi)$; let $C(x)$ be the characteristic function of $H$. We have $b \neq 0$
(since if b =0, Gg(x) = yae*” and has no zeros); then Gg(a) has period 
$2\pi/b$, and hence $C(x)$ has period $2\pi/b$.

Now let F(z) =e (x). Then 


F(z) G(x) edz = f C(x) G(x) = 0, 


since $C(x)G_B(x) = 0$ for all $x$.
Thus (2.6) is satisfied. Also, since any integer $m$ which is not in $N$
has the form $m = a + kb + c$, $0 < c < b$, we have


f F(x) = f C(x) 0), 


since $e^{icx}$ $(0 < c < b)$ is orthogonal to every function of period $2\pi/b$.

We have therefore constructed, if (2.4) is not satisfied, a function $F(x)$
of $L^2$, differing from zero on a set of positive measure, and satisfying (2.5)
and (2.6). Hence (2.4) is a necessary condition for the completeness of
the set (2.3).


3. Deduction of Theorems 2, 3, 4. Consider two functions $g(z)$, $f(z)$
of $W_\pi$. By a theorem of Paley and Wiener,⁸


(3. 2) g(2) = 


by Plancherel’s theorem, 


Thus if (1.5) is satisfied, we obtain 


(3. 4) f e?inth'(t) dt = 0, (n=0,+ 1,+2,:° ‘); 


⁸ R. E. A. C. Paley and N. Wiener, Fourier Transforms in the Complex Domain,
1934, p. 13. For another proof, see M. Plancherel and G. Pólya, “Fonctions entières et
intégrales de Fourier multiples,” Commentarii Mathematici Helvetici, vol. 9 (1936-37),
pp. 224-248; pp. 228 ff.




If (1.6) is also satisfied, we obtain, by Theorem 1, $F(t) = 0$ almost
everywhere, and consequently $f(z) \equiv 0$.

On the other hand, if (1.6) fails, there is a function $F(t)$ of $L^2$,
differing from zero on a set of positive measure, and satisfying (3.4) and (3.5).
The functions $f(z)$ and $g(z)$ defined by (3.1) and (3.2) then belong to $W_\pi$
and satisfy (1.5). This completes the proof of Theorem 2.

If G(t) == on (—z, the function defined by (3.3) is 2xf™(z), 
and Theorem 4 follows. 

We now consider Theorem 3. From a characterization of the operators $A$
given by Bochner⁹ it follows in particular that if $f(z)$ is a function of $W_\pi$
having the form (3.1), then


(3. 6) A{f(z)} 


where $G(t)$ is essentially bounded; conversely, any essentially bounded $G(t)$
defines, through (3.6), an operator $A$ having the properties specified in
Theorem 3.

We write 


so that $g(z) \in W_\pi$. From (3.6) and (3.3) we have
A{f f(v)g(u— dz. 


Theorem 3 now follows from Theorem 2 if we show that (1.6) and (1.8) are 
equivalent for any given G(t). But we have 


9 gj 
? 

U 


2A =f (t) dt; 


00 S 

co 
G(t+7) + G(t) itu (g-imu +1) duu 

-&O 

ors) 

= A cog isudy, 

T -00 

Where ¢ + Jor, 


⁹ S. Bochner, “Ein Satz über lineare Operationen,” Mathematische Zeitschrift,
vol. 29 (1929), pp. 737-743.




In the same way we can establish the theorem stated in footnote 7. We
need for this the result that any linear operator $A$ from $L^2(-\infty, \infty)$ to the
space $L^*$ (defined in footnote 7), permutable with differentiation, has the form


A{f(a)}— G(t)dt, G(t) «L?(— 0, 


where 
f(z) ~ e*tP(t)dt, F(t) eL*(— @, @). 


This theorem can be established by an appropriate modification of Bochner’s 
proof of his theorem cited in footnote 9. 


4. A generalization. It is natural to generalize Theorem 5 by breaking
the set of Fourier functions into three or more sequences instead of only two.
The results which can be obtained in this way are sufficiently indicated by a
special case. Let $F(x)$ belong to $L^2$ and have the Fourier series


F(t) ~ > wei 


v=-00 


Let the functions whose Fourier series are the three sums on the right be
respectively $F_0(x)$, $F_1(x)$, $F_2(x)$. Then we have the following theorem.


THEOREM 6. If $G(x)$ and $H(x)$ belong to $L^2(-\pi, \pi)$, the set of
functions
(4. 1) esnia, G(x) ix, H (zx) ia 
(n=0, + 1, + 2,° 


is complete in $L^2(-\pi, \pi)$ if and only if


Go(x) Ho (x) — Gi(x) H2(x) £0, 


except perhaps on a set of measure zero. 


This can be proved in the same way as Theorem 5. 


DUKE UNIVERSITY. 




SOME UNIQUENESS THEOREMS FOR ENTIRE FUNCTIONS.* 
By R. P. BOAS, JR.¹


1. Introduction. A theorem of Valiron² states that an entire function
of exponential type³ $k < \pi$, having a zero in each interval $(n, n + 1)$ of the
real axis $(n = 0, \pm 1, \pm 2, \cdots)$, is identically zero. If we let the zeros run
together in pairs, and slightly weaken the hypothesis that the function is of
type less than $\pi$, we are led to conjecture the truth of the following theorem.


THEOREM 1. If $f(z)$ is an entire function of exponential type such that


(1.1) f(iy) =O(ell), 
and 
(1.2) $f(2n) = f'(2n) = 0$, $(n = 0, \pm 1, \pm 2, \cdots)$,


then $f(z) \equiv 0$.


It is easy to prove Theorem 1 by considering the entire function
$f(z)\csc^2 \tfrac12\pi z$, but this attack fails for the following more general theorem.


THEOREM 2. If $f(z)$ is an entire function of exponential type⁴
satisfying (1.1) and


(1.3) f(x) =O(ell#l), > 
with 
1 
1. 
($q$ a positive integer), then $f(z) \equiv 0$ if
(1.5) $f(2n) = f^{(2q-1)}(2n) = 0$, $(n = 0, \pm 1, \pm 2, \cdots)$.


This theorem resembles Theorem 4 of the preceding note;⁵ the function
$f(z)$ is now more general, but the condition (1.5) is more restrictive than
the corresponding condition (1.9) of that note. We cannot use derivatives
of even order in (1.5) as long as $l > 0$ in (1.3); this follows from Theorem 3,
more directly from the examples f(z) = 


* Received November 10, 1939.

¹ This note was begun while the author was a National Research Fellow.

² G. Valiron, “Sur la formule d'interpolation de Lagrange,” Bulletin des Sciences
Mathématiques (2), vol. 49 (1925), pp. 181-192, 203-224; 213. I am not quoting the
most precise form of Valiron's theorem.

³ The entire function $f(z)$ is of exponential type $c$ $(c > 0)$ if $|f(z)| < Ae^{c|z|}$.

⁴ It would be enough to suppose that $f(z)$ is of order less than 2; that $f(z)$ is of
exponential type would then follow from (1.1) and (1.3) by a Phragmén-Lindelöf
theorem.

⁵ “A completeness theorem,” American Journal of Mathematics, vol. 62 (1940),
pp. 312-318.




I shall establish a still more general theorem in which the (2q — 1)-th 
derivative in (1.5) is replaced by a linear differential operator; this theorem 
is somewhat similar to Theorem 3 of the preceding note. 

Let $\phi(z)$ be regular in the rectangle $|x| < L$, $|y| < \pi$, so that


(1.6) $\phi(z) = \sum_{p=0}^{\infty} a_p z^p, \quad |z| < \min(L, \pi)$.


Writing $D$ for $d/dz$, we can form, for any entire function $f(z)$ of exponential
type $c$, the expression


(1.7) $\phi(D)f(z) = \sum_{p=0}^{\infty} a_p f^{(p)}(z)$;


the series is convergent (for all $z$) if $c < \min(L, \pi)$, and (as will be shown
below) summable by the method of Mittag-Leffler⁶ in any case. With these
conventions concerning $\phi(z)$, the following theorem holds.


THEOREM 3. A necessary and sufficient condition that every entire
function $f(z)$, of exponential type, satisfying


(1.8) =O(el), mo, 

(1.9) f(a) =O(elltl), ao, 

and 

(1.10) $f(2n) = \phi(D)f(2n) = 0$, $(n = 0, \pm 1, \pm 2, \cdots)$,


should vanish identically, is that 

(1.11) $\phi(z + \tfrac12 i\pi) - \phi(z - \tfrac12 i\pi) \neq 0, \quad |x| < L, \; |y| < \tfrac12\pi$.
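Theorem 2 is the special case where $\phi(z) = z^{2q-1}$. To see this, we observe
that the zeros $z_k$ of $(z + \tfrac12 i\pi)^n - (z - \tfrac12 i\pi)^n$, if $n$ is odd, are determined by
the equation


$z_k + \tfrac12 i\pi = e^{2k\pi i/n}(z_k - \tfrac12 i\pi)$ $(k = 0, 1, 2, \cdots, n - 1)$.


Then we have
$z_k = -\tfrac12 i\pi\,\dfrac{1 + e^{2k\pi i/n}}{1 - e^{2k\pi i/n}} = \tfrac12\pi\cot(k\pi/n)$,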
so that the z are real and outside (— L, L) if 


if n=2m+1. 
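The location of these zeros is easy to confirm numerically. The following sketch is an illustration only; the function names and the choice $n = 5$ are assumptions made for the example. It evaluates $(z + \tfrac12 i\pi)^n - (z - \tfrac12 i\pi)^n$ at the points $\tfrac12\pi\cot(k\pi/n)$ and records the zero nearest the origin.

```python
import math

def shifted_power_zeros(n):
    """Candidate zeros z_k = (pi/2) * cot(k*pi/n), k = 1, ..., n-1."""
    return [0.5 * math.pi / math.tan(k * math.pi / n) for k in range(1, n)]

def residual(z, n):
    """Value of (z + i*pi/2)^n - (z - i*pi/2)^n at the point z."""
    shift = 0.5j * math.pi
    return (z + shift) ** n - (z - shift) ** n

n = 5  # n = 2q - 1 with q = 3, chosen only for illustration
zeros = shifted_power_zeros(n)
# Distance from the origin to the nearest zero; all zeros are real.
smallest = min(abs(z) for z in zeros)
```

The leading terms cancel, so the difference is a polynomial of degree $n - 1$ and the $n - 1$ real points listed are all of its zeros; for odd $n = 2m + 1$ the nearest one corresponds to $k = m$.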


⁶ See, e.g., P. Dienes, The Taylor series, 1931, p. 311.




2. Preliminary discussion. Let $f(z)$ be of exponential type and satisfy
(1.8) and (1.9). By a theorem of Pólya,⁷ we can write


(2.1) $f(z) = \dfrac{1}{2\pi i}\int_C e^{zw}F(w)\,dw$,


where $C$ is a curve containing the “conjugate indicator-diagram” of $f(z)$
in its interior, and $F(w)$ is regular on and outside $C$. By (1.8), (1.9), and
the convexity of the indicator-diagram, we see that we may take as $C$ any
curve outside the rectangle $|u| \leq l$, $|v| \leq k$, where $w = u + iv$; a suitable
choice is the rectangle $|u| = l'$, $|v| = k'$, where $k < k' < \pi$, $l < l' < L$.
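Let the function $\phi(z)$ of the theorem have the power series


(2.2) $\phi(z) = \sum_{p=0}^{\infty} a_p z^p$.
If $\phi(z)$ is regular in a circle containing $C$ in its interior, we clearly have


(2.3) $\dfrac{1}{2\pi i}\int_C e^{zw}F(w)\phi(w)\,dw = \sum_{p=0}^{\infty} a_p \dfrac{1}{2\pi i}\int_C e^{zw}w^p F(w)\,dw = \sum_{p=0}^{\infty} a_p f^{(p)}(z) = \phi(D)f(z)$.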


If $\phi(z)$ is not regular in a circle containing $C$ in its interior, but is regular
in the rectangle $|u| < L$, $|v| < \pi$, the integral on the left of (2.3) still
exists for all $z$. It is clear that if the series $\sum a_p w^p$ is uniformly summable
on $C$ by any linear summation method, the formal calculation in (2.3) will
still be possible, and the result will be that the series $\sum a_p f^{(p)}(z)$ is summable
by the same method, with the integral on the left of (2.3) as its sum. Now
the power series of $\phi(z)$ is uniformly summable by any Mittag-Leffler method
in any closed subset of its Mittag-Leffler star,⁸ which includes at least the
rectangle $|u| < L$, $|v| < \pi$. If we take, for definiteness, the summation
method defined by Lindelöf's function⁹


2.4 
we shall have 


(w)o(w) dw = lim Sn (2) 
JC lim n=0 


where 


Si (2) = (2). 


p=0


⁷ G. Pólya, “Untersuchungen über Lücken und Singularitäten von Potenzreihen,”
Mathematische Zeitschrift, vol. 29 (1929), pp. 549-640; 580 ff.
⁸ P. Dienes, Leçons sur les singularités des fonctions analytiques, 1913, p. 113.
⁹ P. Dienes, op. cit., loc. cit.




We are therefore justified in writing


(2.5) $\phi(D)f(z) = \dfrac{1}{2\pi i}\int_C e^{zw}F(w)\phi(w)\,dw$.


3. Theorem 3: sufficiency. We first prove the sufficiency of our
condition (1.11). In the first place, if (1.10) is satisfied, $g(z) = f(z)\csc\tfrac12\pi z$
is an entire function, obviously satisfying


(3.1) $g(iy) = O(e^{(k - \frac12\pi)|y|}), \quad |y| \to \infty$.

It is easy to see that

(3.2) $g(x) = O(e^{l|x|}), \quad |x| \to \infty$.

In fact, we have,¹⁰ for each $\theta$ in $(0, 2\pi)$, different from 0 or $\pi$,
f (ret?) roo, 


and therefore 
g(re*?) O (ef |sin ) 


, 

Since $g(z)$ is of order one, (3.2) follows by the Phragmén-Lindelöf theorem
for an angle," applied to g(z)e*™. 

We now define a function $\psi(w)$, regular in $|u| < L$, $|v| < \tfrac12\pi$, by the
relation
(3.3) (z) = (2 + dnt) — dat). 
Let $C^*$ be the rectangle $|u| = l'$, $|v| = k' - \tfrac12\pi$; since $g(z)$ satisfies (3.1)
and (3.2), we have

g(2) = fe y(w) aw, 


where y(w) is regular on and outside C*. We define h(z) by 


(3. 4) =¥(D)g(2) = fey (w)y(w) aw, 


Evidently h(z) is an entire function of exponential type, satisfying 


h(iy) y | &, 


h(z) = O(e" ll), |x| — 


We are going to show that 


(3.6) $h(2n) = (-1)^n\phi(D)f(2n)$ $(n = 0, \pm 1, \pm 2, \cdots)$,


¹⁰ E. C. Titchmarsh, The Theory of Functions, 1932, p. 183.
¹¹ Or by the theorem cited in footnote 10.




so that $h(2n) = 0$, $n = 0, \pm 1, \pm 2, \cdots$. Then, by Carlson's theorem,¹²
h(z) =0. But, if y(w) A0 in |u| <L,!v| <k (which is assumed in 
(1.11) ), the function o(w) = 1/p(w) is regular in the same region, and 


w(D)h(2) — fe r(w) dw 


From the representation of $\omega(D)h(z)$ as a summable infinite series, it is clear
that $\omega(D)h(z) \equiv 0$ if $h(z) \equiv 0$. Thus $g(z) \equiv 0$, and consequently
$f(z) = g(z)\sin\tfrac12\pi z \equiv 0$, which we were to prove.

It remains to establish (3.6). In what follows, any infinite series, 


a Ay, 


n= 


is to be understood as a Mittag-Leffler sum as defined in § 2, i.e. as 


lim ii A, 
44 (a =0 n=0 


We have 


#(D)f(2n) arf (2n) 


then, since f(z) = g(z) sin ½πz,


(3.1) (—1)"9(D)f(2n) = Sa > (2n), 
where 
om = (—1)"(sin |2-0n 


(—i)"] 


Now we have, uniformly for w + ½iπ in a closed subset of the star of φ(w),
and in particular for |u| ≤ l′, |v| ≤ k′ − ½π,


$(w + dix) av(w + din)’ 
v=0 

and there is a similar expression for φ(w − ½iπ). Combining these
expressions, we have (referring to (3.3))


=Sa> 


v=0 
the series being uniformly summable on C'*. Consequently we may substitute 
this expression for y(w) in (3.4) and integrate “termwise” along C*, 
obtaining 


¹² E. C. Titchmarsh, op. cit., p. 186.




h(2n) = (ly oy_pg'™ (2n) 


y=0 
= (—1)"o(D)f(en) 
by (3.7). This is (3.6); the proof of the sufficiency part of Theorem 3 is 


thus complete. 


4. Theorem 3: necessity. Suppose that (1.11) is not satisfied. We
have to construct an entire function of exponential type satisfying (1.8)
and (1.9) (with some k < π and l < L), and (1.10), but not vanishing
identically.

Since (1.11) is not satisfied, there is at least one point w₀, with |u₀| < L,
|v₀| < ½π, such that
(4.1) $\phi(w_0 + \tfrac{1}{2} i\pi) - \phi(w_0 - \tfrac{1}{2} i\pi) = 0.$

Let

$F(w) = \dfrac{A(w)}{(w - w_0 - \tfrac{1}{2} i\pi)(w - w_0 + \tfrac{1}{2} i\pi)},$

where A(w) is an entire function taking the value iπ at w = w₀ ± ½iπ; thus
F(w) has residues +1 and −1 at w = w₀ ± ½iπ.
We take numbers /, J, such that 


<l<L, lv tdr|<k<a; 
then the points wy + 43tm are inside the rectangle bounded by the curve C: 
If we set 
1 


f(z) =35 (w)dw, 
us Cc 


it is clear that f(z) satisfies (1.8) and (1.9). Moreover, we have, calculating 
residues, 


f(2n) = en F'(w) dw __ e2n( 0, 


(n=0,+1,+ 2,:°°); 


and 
1 
$(D)f(2n) (w)d(w)dw 
C 
= (— 1) + fin) — $ (wo — 
= 0, (n—=0,+1,+2,° °°); 
by (4.1). 


DUKE UNIVERSITY. 




ERGODIC CURVES AND THE ERGODIC FUNCTION.* 


By Richard KERSHNER. 


1. Introduction. Let M denote a bounded subset of the Euclidean
plane and let ε > 0 be fixed. Then, following M. H. Martin,¹ we have


DEFINITION 1. A continuous curve
(1) C: x = x(t), y = y(t); 0 ≤ t ≤ 1;


will be said to be ε-ergodic to M (or to have the property (ε) with respect
to M²) if, for every point of M, there is a point of C at a distance ≤ ε.
In general, an arbitrary set C satisfying this last condition will be said to
have the property (ε) with respect to M.
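
Definition 1 can be tested numerically when C is approximated by a polygon and M by a finite sample of its points. The following is a minimal sketch under those assumptions; the function names and the sampling of M are illustrative and are not part of the paper. The polygon needs at least two vertices.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from the point p to the closed segment ab."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Parameter of the orthogonal projection, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_polygon(p, vertices):
    """Distance from p to the polygonal curve through `vertices` (>= 2 vertices)."""
    return min(point_segment_distance(p, a, b)
               for a, b in zip(vertices, vertices[1:]))

def has_property_eps(sample_of_M, vertices, eps):
    """True if every sampled point of M is within eps of the polygonal curve."""
    return all(distance_to_polygon(p, vertices) <= eps for p in sample_of_M)
```

For instance, the horizontal mid-line of the unit square has the property (ε) with respect to the four corners for ε = 0.5 but not for ε = 0.4.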


DEFINITION 2. A continuous rectifiable curve (1) will be called an
ε-ergodic curve for M if it is ε-ergodic to M and such that its length A(ε) is
an absolute minimum for the lengths of all continuous rectifiable curves
ε-ergodic to M.


DEFINITION 3. The length A(ε) of an ε-ergodic curve for M, considered
for varying ε, is called the ergodic function for M.


Martin has shown³ that, for arbitrary M and ε > 0, there is at least one
C = C(ε) satisfying Definition 2, so that the function A(ε) of Definition 3
is well defined for all ε > 0. This function is clearly non-negative and
non-increasing with ε. Recently⁴ Martin has shown it to be a continuous function
of ε. He had previously pointed out⁵ that A(ε) → ∞ as ε → 0 unless M is
a point set lying on a continuous rectifiable curve. In the last section of the
present paper the description of the asymptotic behavior of A(ε), for small ε,
is extended by showing that, for an arbitrary set M,


$\lim_{\epsilon \to 0} 2\epsilon A(\epsilon) = \operatorname{meas} \bar{M},$


* Received March 16, 1939. 

¹ M. H. Martin, “Ergodic curves,” American Journal of Mathematics, vol. 58
(1936), pp. 727-734.

² Cf. A. Errera, “Un Problème de Géométrie Infinitésimale,” Académie Royale de
Belgique Mémoires, vol. 12 (1932), p. 4.

* Loe. cit., 1, tel: 

⁴ M. H. Martin, “Note on the continuity of the ergodic function,” Bulletin of the
American Mathematical Society, vol. 43 (1937), pp. 541-546.

⁵ Loc. cit., 1, 733.




where M̄ is the closure of M. It should be mentioned that this last section
can be read independently of what precedes it.

At the present stage it seems hopeless to expect the explicit determination
of C(ε), for all ε > 0, for even the simplest sets M of positive plane measure.
However it is possible to find considerable information about the nature of
C(ε) both locally and in the large. The greater part of this paper is devoted
to investigations of this nature. The main results are that C has no double
points, has at every point a right and left hand tangent, has a well-defined
tangent up to a countable number of corners and has no cusps.


2. Preliminary lemma. This section will be devoted to a general lemma
on the parametrization of rectifiable curves which will be very useful in the
sequel. This lemma may also be stated in such a way as to have, apparently,
nothing to do with parametrization; viz.,


LEMMA 1. Any continuous rectifiable curve C may be uniformly
approximated by simple polygons, whose lengths approximate that of C.


Proof. The proof of this Lemma 1 is routine and will simply be outlined.
First one chooses a polygon B₁ which approximates C. Then, if B₁ has
multiple points which are not simple isolated double points one performs a
slight deformation so as to obtain a polygon B₂ approximating B₁ and such
that all multiple points are simple isolated double points (and there are only
a finite number of these). Now let the polygon B₂ be traced in a definite
manner and suppose that at a given double point p the polygon B₂ actually
crosses itself when traced in this manner. Then if the sense of tracing is
reversed along that portion of B₂ which consists of a closed curve through p,
B₂ will no longer cross itself at p. Then, evidently, the double point p may
be “pulled apart” without introducing any new double points. In this way
the (finite number of) double points of B₂ may be removed and a simple
polygon B₃ found which approximates B₂ and therefore C.

A restatement of Lemma 1 which explicitly introduces the parametric 
representation (1) of C will also be convenient. First 


DEFINITION 4. The parametric representation (1) of C will be said to
be non-crossing if the following condition is satisfied: Let t₁ < t₂ be the
parameters of any double point of C and let T be any simple closed curve
containing this double point which meets the four branches of C corresponding
to t < t₁, t > t₁, t < t₂, t > t₂. Let p₁, p₂, p₃, p₄ be the four points of T whose
parameters are, respectively, the greatest, least, greatest, least value of t
satisfying these four inequalities and giving points on T. Then p₁, p₂ do not
separate p₃, p₄ on T.


Clearly, if C has no double points, then any parametric representation is 
non-crossing. On the other hand Lemma 1 shows that 


LEMMA 1bis. Any continuous rectifiable curve C has a parametric
representation (1) which is non-crossing.


Proof. In fact one simply chooses a sequence of simple polygons
converging to C even in length, parametrizes each polygon by its arc length, and
lets the limit of these parametrizations define a parametrization for C. Then
it is very easily verified that Definition 4 is satisfied by this representation.


ASSUMPTION A. In the sequel it will always be assumed that the
parametric representation


(1) C: x = x(t), y = y(t); 0 ≤ t ≤ 1;
satisfies Definition 4.


3. Terminology and notation. Let C be an ε-ergodic curve (1) for M.
Then


DEFINITION 5. By the point [t]⁶ of C will be meant the point with
coordinates x(t), y(t).

DEFINITION 6. By the arc (s, u), (where s < u is always understood)
will be meant the set of all points [t] with s < t < u. If one or both of
these parentheses is replaced by a square bracket, it is understood that equality
is allowed at the corresponding end or ends of this last inequality.


DEFINITION 7. The arc (s, u) of C will be called non-significant if the
point set consisting of the two arcs [0, s], [u, 1] of C − (s, u) has the
property (ε) with respect to M. In the contrary case (s, u) will be called
significant.
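
Definition 7 also lends itself to a direct numerical check once C is replaced by a polygon and M by a finite sample. The sketch below makes both of those assumptions and takes the arc (s, u) to run between two vertices of the polygon, identified by their indices i < j; all names are illustrative.

```python
import math

def _dist_point_segment(p, a, b):
    """Euclidean distance from the point p to the closed segment ab."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def _dist_to_path(p, path):
    """Distance from p to a polygonal path; a one-vertex path is a single point."""
    if len(path) == 1:
        return math.hypot(p[0] - path[0][0], p[1] - path[0][1])
    return min(_dist_point_segment(p, a, b) for a, b in zip(path, path[1:]))

def arc_is_non_significant(sample_of_M, vertices, i, j, eps):
    """True if the two remaining arcs [0, s] and [u, 1] still have property (eps).

    The open arc (s, u) is the part of the polygon strictly between
    vertex i and vertex j, with i < j.
    """
    head, tail = vertices[: i + 1], vertices[j:]
    return all(min(_dist_to_path(p, head), _dist_to_path(p, tail)) <= eps
               for p in sample_of_M)
```

A detour that serves no sampled point of M is reported as non-significant; adding a sampled point near the tip of the detour makes the same arc significant.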

DEFINITION 8. A point [t] of C will be called non-significant if some
arc (s, u), with s < t < u, is non-significant. In the contrary case,⁷ that
every such (s, u) is significant, the point [t] will be called significant.


DEFINITION 9. A point of M will be said to be salient to (s, u) if it is
at a distance ≤ ε from some point of (s, u) and at a distance > ε from every
point of C − (s, u). Such a point will be denoted by p(s, u), and the
totality of all such points by P(s, u).

⁶ The letters s, t, u will all be used as parameter values in the sequel but the generic
parameter value will always be referred to as t.

⁷ Here, and throughout the paper, it is assumed that we are not dealing with an
end point 0 or 1. This assumption is made simply for simplicity of statement and the
definitions and results are readily extended to include the case of end points.




DEFINITION 10. A point of M will be said to be salient to [t] if it is
at a distance ε from [t] and at a distance ≥ ε from any point of C. Such a
point will be denoted by p(t), and the totality of all such points by P(t).


DEFINITION 11. The circle of radius ε about a point p(t) will be called
a salient circle through [t] and denoted by S(t).


DEFINITION 12. A significant point [t] will be called simply significant 
if there is exactly one salient circle S(t) through [t]. 


DEFINITION 13. A significant point [t] will be called doubly significant
if there are exactly two salient circles S₁(t) and S₂(t) through [t] and if
these are mutually tangent.


DEFINITION 14. A significant point [t] will be called multisignificant
if there are at least two intersecting salient circles through [t].


Notice that a salient circle S(t) passes through [t] but has no point of C
in its interior. It will be shown, in the next section, that there is at least
one S(t) through every significant point [t] so that Definitions 12, 13, 14
provide a classification of all significant points. Notice also that, according
to Definition 10, the set P(t) of points salient to [t] is a closed set lying on
the boundary of the ε-circle about [t]. In view of this one can make the
following definition:

DEFINITION 15. Let [t] be significant and let S_ε be the circle of radius
ε about [t]. Let A_ε be a⁸ closed arc of S_ε of minimum length which contains
the set P(t). Then the angular measure θ(t) of this arc is called the salient
angle for [t], and its endpoints are called the terminal salient points p₁(t),
p₂(t) for [t].
Notice that


(2a) θ(t) = 0, if [t] is simply significant;
(2b) θ(t) = π, if [t] is doubly significant;
(2c) 0 < θ(t) ≤ π, if [t] is multisignificant.


The last of these relations comes from the fact that if θ(t) > π, then the set
of circles S(t) would completely cover some neighborhood of [t] so that [t]
would be an isolated point of C, contradicting the continuity of C. It will be
shown in the next section that the equality sign in (2c) cannot hold.
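
The salient angle of Definition 15 is the angular measure of a shortest closed arc of the ε-circle containing P(t). For a finite, non-empty set of salient points, given by their polar angles about [t], that measure equals 2π minus the largest angular gap between cyclically consecutive points. A minimal sketch under this finite-sample assumption (the paper allows P(t) to be any closed set; the function name is illustrative):

```python
import math

def salient_angle(angles):
    """Angular measure of a shortest closed arc containing all given directions.

    `angles` is a non-empty list of polar angles (radians) of the salient
    points, measured about the point [t].
    """
    two_pi = 2.0 * math.pi
    ordered = sorted(a % two_pi for a in angles)
    # Gaps between consecutive directions, including the wrap-around gap.
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    gaps.append(ordered[0] + two_pi - ordered[-1])
    return two_pi - max(gaps)
```

A single salient point gives the angle 0 of (2a), and two diametrically opposite salient points (the centers of two mutually tangent salient circles) give the angle π of (2b).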


⁸ This arc will not be unique if [t] is doubly significant but its length and end
points are, of course, determined.




DEFINITION 16. By K(p₁, p₂) and ρ(p₁, p₂), where p₁ and p₂ are points,
will be understood the linear segment joining p₁ and p₂, and its length,
respectively.

4. Fundamental relations. This section will be devoted to a study of
some of the fundamental relations between the concepts introduced in the
preceding section.


LEMMA 2. Suppose the point [t] of C is significant. Then there is at
least one salient circle S(t) through [t].


Proof. Let (sᵢ, uᵢ) be a sequence of arcs such that


(3a) $s_i < s_{i+1} < t < u_{i+1} < u_i \quad (i = 1, 2, \cdots)$
and
(3b) $\lim_{i \to \infty} s_i = \lim_{i \to \infty} u_i = t.$


Then, by Definition 8, (sᵢ, uᵢ) is significant for all i. According to Definition 7
this means that the set P̄ᵢ, which is defined as the closure of the set of points
P(sᵢ, uᵢ), is non-empty. In fact Definition 6 implies the existence of points
of M which are at a distance > ε from [0, sᵢ] + [uᵢ, 1]. These points must
be at a distance ≤ ε from (sᵢ, uᵢ) since C has the property (ε) with respect
to M and consequently must be points of P(sᵢ, uᵢ) by Definition 9.

It is clear from Definition 9 that P(sᵢ, uᵢ) ⊃ P(sᵢ₊₁, uᵢ₊₁) in view of (3a).
Thus
(4) $\bar{P}_i \supset \bar{P}_{i+1}, \quad i = 1, 2, \cdots; \quad \bar{P}_i$ not empty.


By a well-known theorem on closed sets (4) implies that there is a closed,
non-empty set Π = Π(t) such that
(5) $\Pi(t) = \lim_{i \to \infty} \bar{P}_i = \prod_{i=1}^{\infty} \bar{P}_i.$

Let p₀ be any point of Π(t). Then, by (5) and the definition of P̄ᵢ, p₀
is at a distance ≤ ε from (sᵢ, uᵢ) for all i. Thus, by (3b), p₀ is at a distance
≤ ε from [t]. On the other hand p₀ is at a distance ≥ ε from any point of
C − (sᵢ, uᵢ) for every i, and so at a distance ≥ ε from C. Thus, by
Definition 10, p₀ is a salient point p(t) to [t], and the ε-circle about p₀ is a salient
circle through [t]. This completes the proof of Lemma 2.

Incidentally the following fact, which will be useful in the sequel, has
been demonstrated during the course of the last proof.

LEMMA 3. Let s < t < u. Then

$\lim_{s, u \to t} \bar{P}(s, u) \subset P(t)$

where P̄(s, u) is the closure of P(s, u).




Actually this was proved above only in the case that [t] was a significant
point, but the lemma is vacuously true if [t] is non-significant.

Lemma 3 is quite restrictive in case [t] is simply or doubly significant;
i.e. if P(t) consists of one or of two points, but in case [t] is multisignificant
a stronger result is needed and will be given next.


LEMMA 4. Let [t] be a multisignificant point of C and let p₁(t), p₂(t)
be the two terminal salient points to [t]. Let s < t < u. Then, if the
notation is chosen appropriately

$\lim_{s \to t} \bar{P}(s, t) \subset (p_1(t)); \qquad \lim_{u \to t} \bar{P}(t, u) \subset (p_2(t)).$

Proof. First, by Definition 9,

$P(s, t) + P(t, u) \subset P(s, u).$

Now, by Lemma 3,

$\lim_{s, u \to t} \bar{P}(s, u) \subset P(t)$

so that, a fortiori,

(6) $\lim_{s \to t} \bar{P}(s, t) + \lim_{u \to t} \bar{P}(t, u) \subset P(t).$


It will now be shown that no point p(t) different from p₁(t), p₂(t) can
be a point of the left side of (6). To this end let p₃(t) be a fixed point of
P(t) which is not terminal. The point p₃(t) is then an interior point of the
arc A_ε mentioned in Definition 15. Let S_η be a circle with center at p₃(t)
and radius η, where η is so small that S_η does not cross either chord K([t],
p₁(t)), K([t], p₂(t)). The circle S_η is divided by an arc of A_ε into two
parts, one lying within and one outside S_ε. The points of S_η within or on S_ε
are all at a distance ≤ ε from [t] and so, according to Definition 9, cannot
belong to P(s, t) + P(t, u) for any s < t < u. On the other hand, the
points of S_η outside S_ε are obviously at a distance > ε from any point of
(s, u) for sufficiently small u − s in view of the fact that (s, u) cannot enter
either circle S₁(t), S₂(t) of radius ε about p₁(t), p₂(t), respectively. Thus
no point of S_η is a point of P(s, t) + P(t, u) and p₃(t) cannot be a point of

$\lim_{s \to t} \bar{P}(s, t) + \lim_{u \to t} \bar{P}(t, u).$

Thus (6) may be strengthened to

$\lim_{s \to t} \bar{P}(s, t) + \lim_{u \to t} \bar{P}(t, u) \subset (p_1(t)) + (p_2(t)).$


The separation of this last relation into the two separate inclusions
required by Lemma 4 is accomplished very easily by recalling the Assumption A
stating that the two arcs (s, t) and (t, u) do not cross so that one is “nearer”




p₁(t) and the other “nearer” p₂(t). The details of the separation will not
be given as they are readily supplied.

The next two lemmas are immediate consequences of the definitions of 
the concepts involved and are stated simply for reference. 


LEMMA 5. The set of significant points of C(ε) form a closed subset
of C(ε).

LEMMA 6. For any t,

$\lim_{u \to t} \bar{P}(u) \subset P(t).$

It should be mentioned that the limit here, as in Lemma 3 and Lemma 4,
is the ordinary point set limit; i.e., the set of all limit points obtained using
any sequence of u-values and any choice of particular points of P(u). Notice
that actually P̄(u) = P(u) (i.e., P(u) is closed) but P̄(u) has been written
for the sake of the analogy with Lemmas 3 and 4.


LEMMA 7. For any t₀,

$\limsup_{u \to t_0} \theta(u) \le \theta(t_0).$

Proof. This is an easy consequence of Lemma 6. In fact if
$\limsup_{u \to t_0} \theta(u) = 0$, there is nothing to prove. Then suppose
$\limsup_{u \to t_0} \theta(u) = \theta > 0$ and let {uᵢ}, i = 1, 2, ···, be a sequence
of t values such that uᵢ → t₀ and

(7) $\lim_{i \to \infty} \theta(u_i) = \theta.$

Now let p₁(uᵢ), p₂(uᵢ) be the terminal salient points for [uᵢ]. Then the
points p₁(uᵢ) are an infinite set in a bounded closed region M and have at
least one cluster point p₁ in M. Let {u_{nᵢ}} be a subsequence of the {uᵢ}
such that

(8) $\lim_{i \to \infty} p_1(u_{n_i}) = p_1.$

Then, in view of Definition 15, the relations (7) and (8) imply that

(9) $\lim_{i \to \infty} p_2(u_{n_i}) = p_2$

exists; and, further, that p₁ and p₂ are two points on the ε-circle about [t₀]
which are separated by an angle θ on this circle. Since (8) and (9) imply
that p₁ and p₂ are points of P(t₀), in view of Lemma 6, the proof of Lemma 7
is complete.

In general the inequality given by Lemma 7 cannot be replaced by
equality; but the case when this is possible, namely when
$\limsup_{u \to t} \theta(u) = \pi$,
deserves special mention in view of its later usefulness. In particular,




LEMMA 8. Let [t] be a limit point of doubly significant points. Then
either [t] is doubly significant or multisignificant with θ(t) = π.

As mentioned before, it will later be shown that the second alternative 
here cannot actually occur so this Lemma 8 will be strengthened. 


5. Local properties. This section will be devoted to establishing a few 
local properties of C which will be needed later. 

LEMMA 9. Let [t₀] be a non-significant point of C. Then if u − s is
sufficiently small, s < t₀ < u, the arc (s, u) is linear.

Proof. According to Definition 8, some (s, u) is non-significant.
Suppose this (s, u) is not linear. Then the curve obtained from C by replacing
the arc (s, u) by its chord K([s], [u]) is a shorter curve which, according to
Definition 7, has the property (ε) with respect to M. This contradicts the
assumption that C was an ε-ergodic curve for M.
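
The shortening step in the proof of Lemma 9 can be illustrated on a polygon: replacing a non-linear arc by its chord never lengthens the curve, and strictly shortens it when the arc is not a segment. A minimal sketch, assuming C is given as a list of vertices and the arc runs between the vertices with indices i < j (the names are illustrative):

```python
import math

def polygon_length(vertices):
    """Total length of the polygonal curve through `vertices`."""
    return sum(math.hypot(bx - ax, by - ay)
               for (ax, ay), (bx, by) in zip(vertices, vertices[1:]))

def replace_arc_by_chord(vertices, i, j):
    """Drop the vertices strictly between index i and index j (i < j),
    so that the arc between them is replaced by its chord."""
    return vertices[: i + 1] + vertices[j:]
```

Whether the shortened curve still has the property (ε) is a separate question; in the lemma it is answered by the hypothesis that the arc is non-significant.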

LEMMA 10. Let [t₀] be a simply significant point of C and let L(t₀)
be the line tangent to the unique salient circle S(t₀) at [t₀]. Then if u − s
is sufficiently small, s < t₀ < u, the arc (s, u) lies in that closed half plane
determined by L(t₀) which does not contain p(t₀).

Proof. Suppose the statement is false; i.e., that there is a sequence of
points [tᵢ] lying in that open half plane determined by L(t₀) which contains
p(t₀) and such that tᵢ → t₀. For the sake of definiteness let it be supposed
that tᵢ < t₀.

Let S* be the circle of radius ½ε about p(t₀). Then, by Lemma 3, values
s₀, u₀ may be chosen so that

$P(s_0, u_0) \subset S^*, \quad s_0 < t_0 < u_0,$

and, a fortiori,
(10) $P(s_0, t_0) \subset S^*.$

Now about p(t₀) draw a circle S(t₀, ρ) of radius ε + ρ, where ρ > 0 is
so small that S(t₀, ρ) intersects both arcs (s₀, t₀) and (t₀, u₀).⁹ Let s₁ be the
greatest t < t₀ and u₁ the least t > t₀ such that [s₁] and [u₁] are on S(t₀, ρ).
Then by the first paragraph of this proof there are points of (s₀, t₀) lying in
that open half plane determined by L(t₀) which contains p(t₀). Thus some
half line K(t₀), terminated by [t₀] and lying in this same half plane, meets
(s₁, t₀) in a point ≠ [t₀]. Clearly K(t₀) may be chosen in such a way
that it does not meet S*. It is supposed that this has been done and also that


⁹ This is possible unless one of (s₀, t₀), (t₀, u₀) lies exactly along S(t₀). In this
case the argument which follows may be modified by choosing the notation so that
(s₀, t₀) lies along S(t₀) and then choosing ρ = 0, u₁ = t₀, s₁ = t* = s₀.




[t*] is the first point distinct from [t₀] where K(t₀) meets (s₁, t₀). (That
there is such a first point is clear from the fact that K(t₀), near [t₀], lies in
S(t₀) and so does not meet C.)
Now, by (10),

$P(t^*, t_0) \subset S^*.$

This means that the only points of M at a distance ≤ ε from (t*, t₀) which
are not also at a distance ≤ ε from C − (t*, t₀) are points of S*. But, by
Assumption A, the arc (t*, t₀), which does not cross K(t₀) by the definition
of t*, cannot cross either of the two arcs comprising (s₁, u₁) − (t*, t₀).
Thus (t*, t₀) is separated from S* in the convex curve S(t₀, ρ) by the arc
A(s₁, u₁) defined by

$A(s_1, u_1) = (s_1, t^*) + K([t^*], [t_0]) + (t_0, u_1).$

Thus any point of S* at a distance ≤ ε from (t*, t₀) is also at a distance
≤ ε from A(s₁, u₁). Then the curve

$[0, s_1] + A(s_1, u_1) + [u_1, 1]$

is ε-ergodic to M. Since this new curve is obviously shorter than C, this
contradicts the assumption that C was an ε-ergodic curve for M and completes
the proof of Lemma 10.


LEMMA 11. Let [t₀] be a doubly significant point of C. Then if u − s
is sufficiently small, s < t₀ < u, the arc (s, u) lies between two mutually
tangent ε-circles through [t₀].

Proof. This is trivial (the circles being the two salient circles S₁(t₀),
S₂(t₀) assumed in Definition 13) and has been included for reference only.

LEMMA 12. Let [t₀] be a multisignificant point of C and let L₁(t₀),
L₂(t₀) be the tangents to the two terminal salient circles S₁(t₀), S₂(t₀),
respectively, at [t₀]. Then if u − s is sufficiently small, s < t₀ < u, the arc
(s, u) is contained in that closed angle determined by L₁(t₀), L₂(t₀) which
contains no interior point of S₁(t₀) or S₂(t₀).¹⁰


Proof. The proof of Lemma 12 precisely parallels that of Lemma 10 and
will not be given. It should be noticed that Lemma 4 serves here the purpose
of establishing a relation like (10), where, of course, S* will be a ½ε-circle
about p₁(t₀) or p₂(t₀) according to which arc is to be modified.


¹⁰ In case θ(t₀) = π this closed angle degenerates to a half line and is not uniquely
determined by the condition that it contain no interior point of S₁(t₀) or S₂(t₀).
However, since [t₀] is multisignificant, there must be, in this case, a third salient circle
S₃(t₀) intersecting S₁(t₀) and S₂(t₀) (cf. Definition 14). Then the half line is
determined by the condition that it does not cut S₃(t₀).




6. Double points. This section will be devoted to the proof of 


THEOREM 1. Let M be an arbitrary plane point set. Let ε > 0 be fixed,
and let C = C(ε) be an ε-ergodic curve for M. Then C has no double points.


Proof. Suppose the statement is false; i.e., that there is a t₁ ≠ 0 and
a t₂ ≠ 1,¹¹ such that t₁ < t₂ while [t₁] = [t₂]. Then there are a number of
cases to be considered which are not mutually exclusive but which together
exhaust all possibilities.


Case 1. The points s₁ < t₁ < u₁, s₂ < t₂ < u₂ can be chosen so that the
four arcs (s₁, t₁), (t₁, u₁), (s₂, t₂), (t₂, u₂) coincide (as point sets) in two
identical pairs. Suppose that these four arcs have been extended so that they
are as long as possible satisfying the required condition of coinciding in two
pairs and such that each coincident pair have coincident end points. Then
there are the following possibilities (not mutually exclusive).


Case 1.1: s₁ = 0. Then the curve obtained from C by deleting the arc
[s₁, t₁) = [0, t₁) is “shorter” in the parametric sense but identical in a point
set sense with C. This contradicts the assumption that C was an ε-ergodic
curve for M.


Case 1.2: u₂ = 1. A contradiction is reached, as in Case 1.1, by
deleting (t₂, u₂] = (t₂, 1].

Case 1.3: s₁ ≠ 0; u₂ ≠ 1; [s₁] = [u₂]. In this case [s₁] = [u₂] is a
double point which does not come under Case 1. For if [s₁] = [u₂] came
under Case 1, then the given arcs [s₁, t₁] = [t₂, u₂] could be extended to
longer point set identical arcs with coincident end points, contradicting the
assumption that the given arcs were the longest such. Thus to exclude Case
1.3 it will be sufficient to show there are no double points which are not in
Case 1.

Case 1.4: 8,0; [s,:] —[s.]. This is treated exactly as is Case 1.3. 

Case 1.5: 8,340; [s:] —[u.]. This is treated exactly as is Case 1.3. 


The above five possibilities exhaust Case 1 in view of the assumption 
that the identical arcs had identical end points. 


Case 2. The points s₁ < t₁ < u₁, s₂ < t₂ < u₂ can be chosen so that
some three of the four arcs (s₁, t₁), (t₁, u₁), (s₂, t₂), (t₂, u₂) are identical (as
point sets). This case can be treated in essentially the same way as Case 1
and the detailed treatment will not be given. Either a contradiction is reached


¹¹ Cf. footnote 7. Of course t₁ = 0 and t₂ = 1 is not considered as a double point
but merely means that C is closed, which is trivially seen to be possible.




or one is led to the existence of a double point which is not in Case 1 or Case 2. 
Thus it will be sufficient to prove the impossibility of this last. 


Case 3. There are no salient circles through [t₁] = [t₂]. Then, by
Lemma 2, both [t₁] and [t₂] are non-significant; i.e., some arcs (s₁, u₁),
(s₂, u₂), with s₁ < t₁ < u₁, s₂ < t₂ < u₂ are non-significant. Then, by Lemma
9, these two arcs are linear. But by Assumption A these two segments through
[t₁] = [t₂] are non-crossing. This is possible only if [t₁] = [t₂] is in Case 1.


Case 4. There is exactly one salient circle S through [t₁] = [t₂]. Let
p be the center of S and L the tangent line to S at [t₁] = [t₂]. Now let
sᵢ, uᵢ, i = 1, 2, be chosen so that


(11a) sᵢ < tᵢ < uᵢ, (i = 1, 2);
(11b) P(sᵢ, uᵢ) ⊂ S, (i = 1, 2);
(11c) (sᵢ, uᵢ) ⊂ H, (i = 1, 2);


where H denotes that closed half plane determined by L which does not
contain p. The possibility of satisfying (11b) is assured by Lemma 3 while
(11c) may be satisfied in view of Lemma 10 if [tᵢ] is simply significant and
in view of Lemma 9 if [tᵢ] is non-significant.

Now let a circle S(ρ) be drawn with center at p and with radius ε + ρ
where ρ > 0 is so small that S(ρ) intersects all four arcs, (sᵢ, tᵢ), (tᵢ, uᵢ),
i = 1, 2. That such a ρ > 0 exists is clear from (11c). Finally, let s′ᵢ, u′ᵢ,
respectively, be the greatest t < tᵢ and the least t > tᵢ such that [s′ᵢ] and
[u′ᵢ] are on S(ρ), i = 1, 2.

Consider the arc A(ρ) of S(ρ) which lies in H. The four points [s′ᵢ],
[u′ᵢ] lie on A(ρ) in some linear order. (It is not excluded that certain, or
even all, of these points coincide; in which case there will be a corresponding
ambiguity in this linear order.) It will be unimportant in which sense this
order is established so that the twenty-four permutations on four letters reduce
to twelve cases that will be considered distinct. Of these twelve possibilities,
four are eliminated, in view of Definition 4, by Assumption A. Then the
remaining eight possibilities may be reduced, by making use of the fact that
the above notation may be changed, by reversing the direction of the
parametrization along C, to one of the following two types:


Case 4.1. The linear order is [s′₁], {[s′₂], [u′₂]}, [u′₁].
Case 4.2. The linear order is {[s′₁], [u′₁]}, {[s′₂], [u′₂]}.


Here the curly brackets signify that it is of no consequence which of the
two symbols contained occurs first; i.e., {p₁, p₂} means either p₁, p₂ or p₂, p₁.
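
The count in the preceding paragraph (twenty-four orders, twelve up to reversal, four eliminated, eight remaining) can be checked by enumeration. The sketch below assumes that the orders eliminated by Assumption A are exactly those in which the pair [s′₁], [u′₁] alternates with the pair [s′₂], [u′₂] along A(ρ); the labels and helper names are illustrative.

```python
from itertools import permutations

LABELS = ("s1", "u1", "s2", "u2")

def canonical(order):
    """Representative of an order up to reversal of the sense along A(rho)."""
    return min(order, tuple(reversed(order)))

def pattern(order):
    """Which curve branch (1 or 2) each of the four points belongs to."""
    return "".join(label[-1] for label in order)

def is_interleaved(order):
    """True if the two pairs alternate, i.e. the arcs would have to cross."""
    return pattern(order) in ("1212", "2121")

classes = {canonical(p) for p in permutations(LABELS)}
admissible = {c for c in classes if not is_interleaved(c)}
nested = {c for c in admissible if pattern(c) in ("1221", "2112")}
separated = admissible - nested

print(len(classes), len(admissible), len(nested), len(separated))
```

Under this assumption the eight admissible orders split into four nested and four separated ones, which correspond to the types of Case 4.1 and Case 4.2 once the roles of the two branches are allowed to be interchanged.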




In Case 4.1 the arc [s′₁, u′₁] separates, in a weak sense, the arc [s′₂, u′₂]
from S in S(ρ).¹² In particular, every point of S is as close to some point
of [s′₁, u′₁] as to any point of [s′₂, u′₂]. Thus there are no points of P(s′₂, u′₂)
in S. In view of (11b) this means that P(s′₂, u′₂) is empty; i.e., that (s′₂, u′₂)
is non-significant. Then by Lemma 9, (s′₂, u′₂) is linear. But, by (11c) and
the fact that [s′₁, u′₁] separates [s′₂, u′₂] from S, this implies that also [s′₁, u′₁]
is linear and these two arcs [s′ᵢ, u′ᵢ] are both segments of L. Thus Case 4.1
reduces to Case 1.

Case 4.2 is somewhat more troublesome. Let it be assumed, for the sake
of definiteness, that the order is actually [s′₁], [u′₁], [s′₂], [u′₂]. It will be
clear that this is no restriction since the proof will not refer to the nature of
C outside S(ρ).¹³ Now let L be chosen as the Y-axis of a Cartesian (X, Y)
plane with origin at the point [t₁] = [t₂]. Then either [s′₁] and [u′₁] are
both in the closed upper half plane or [s′₂], [u′₂] are both in the closed lower
half plane (provided the positive Y direction is chosen appropriately). It will
be a notational assumption that the first of these alternatives is true. It will
also be assumed that not both [s′₁] and [u′₁] are on the X-axis.¹⁴

Now consider the ellipse with foci at [s′₁] and [u′₁] and passing through
[t₁] = [t₂]. It is easily seen that this ellipse has, at the point [t₁] = [t₂],
a negative slope; i.e., that a point p* may be chosen on S, in the open second
quadrant and in this ellipse. It is supposed that p* is chosen so that the
principal arc S(p*, [t₁]) of S, joining p* and [t₁], has an angular measure
< ½π. The fact that p* is in the given ellipse means that

$\rho([s'_1], p^*) + \rho(p^*, [u'_1]) < \rho([s'_1], [t_1]) + \rho([t_1], [u'_1]),$

(cf. Definition 16). Now let ¢* be the greatest ¢ < ¢t, such that [/*| is on 
the chord K (p*, [s’,]) joining p* and [s’;]. Then it is immediately seen that 
Thus it is seen that the curve C* which results from C by replacing the are 

(1*,u’,) by the two chords K([t*], p*) + K(p*, [w’1]) is shorter than C. 

It will now be shown that this C* has the property (€) with respect to 

M so that C is not an e-ergodic curve for M. This contradiction will complete 


¹² In case [s′₁] = [s′₂], [u′₁] = [u′₂] or [s′₁] = [u′₂], [s′₂] = [u′₁] this
statement may be understood as a notational assumption, in view of Assumption A.

¹³ It is easily seen, by an argument similar to that used in Case 4.1, that the arcs
(t₁, u′₁) and (s′₂, t₂) are non-significant and therefore linear, but this fact will not
be needed.

¹⁴ This is clearly justified unless all four points [s′ᵢ], [u′ᵢ] coincide on the X-axis.
If this occurs one simply chooses a smaller value of ρ in defining S(ρ). If the same
trouble occurs for all small ρ > 0, then [t₁] = [t₂] is actually in Case 1 which has
been treated.




the elimination of Case 4.2. To this end note first that the arc (t*, u′₁) is
separated from S in S(ρ) by the arc A(s′₁, u′₂) defined by

$A(s'_1, u'_2) = [s'_1, t^*] + K([t^*], p^*) + S(p^*, [t_1]) + [t_2, u'_2]$

(where S(p*, [t₁]) is the principal arc of S joining p* and [t₁] as mentioned
above). This is true since (t*, u′₁) cannot cross [s′₁, t*] or [t₂, u′₂] by
Assumption A, K([t*], p*), by the definition of t*, or S(p*, [t₁]) by (11c).

Now suppose C* has not the property (ε) with respect to M. Then some
point p of M at a distance ≤ ε from C would be at a distance > ε from C*.
Since C* contains all of C save (t*, u′₁), this point p, in particular, would
have to be salient to (t*, u′₁). Then, by (11b), p ⊂ S. Then the ε-circle
about the point p ⊂ S has an interior or boundary point on (t*, u′₁) but
does not cross C* and in particular does not cross A(s′₁, u′₂) − S(p*, [t₁]).
Thus, according to the preceding paragraph, this ε-circle about p must cross
S(p*, [t₁]) twice. But this is impossible for p ⊂ S since S(p*, [t₁]) has
been assumed to have an angular measure < ½π. According to the preceding
paragraph the treatment of Case 4.2 (and consequently of Case 4) is complete.


Case 5. There are at least two intersecting salient circles through
$[t_1] = [t_2]$. In this case clearly both $[t_1]$ and $[t_2]$ are multisignificant and
$\theta(t_1) = \theta(t_2)$. Here two cases are distinguished of which one is trivial.

Case 5.1: $\theta(t_1) = \theta(t_2) = \pi$. This case reduces immediately to Case 1
in view of Lemma 12 (cf. particularly footnote 9).

Case 5.2: $\theta(t_1) = \theta(t_2) < \pi$. The treatment of Case 5.2 closely parallels
that of Case 4 and will not be given. It is enough to note that Lemma 4
serves here the purpose served by Lemma 3 in Case 4 while Lemma 12 replaces
Lemma 10. Of course, in view of Lemma 4, the four arcs $(s_i, t_i)$, $(t_i, u_i)$
must always be considered separately, but the only difficulty introduced by
this is a notational one.

Case 6. There are exactly two, mutually tangent, salient circles through
$[t_1] = [t_2]$. Let $p_1$, $p_2$ be the centers of these two salient circles $S_1$, $S_2$ and
let S*,, S*, be the circles of radius Je about p,, po, respectively. Let si, wi, 


$i = 1, 2$, be chosen so that

(12b) $P(s_1, u_1) + P(s_2, u_2) \subset S^*_1 + S^*_2$

as is possible by Lemma 3.
Let $S_i(\rho_i)$, $i = 1, 2$, be a circle of center $p_i$ and radius $\epsilon + \rho_i$ where
$\rho_i > 0$ is so small that $S_i(\rho_i)$ meets all four arcs $(s_1, t_1)$, $(t_1, u_1)$, $(s_2, t_2)$,




$(t_2, u_2)$.¹⁵ Let $s_{ij}$, $u_{ij}$, respectively, be the greatest $t < t_i$, and the least
$t > t_i$, such that $[s_{ij}]$ and $[u_{ij}]$ are on $S_j(\rho_j)$, $j = 1, 2$; $i = 1, 2$. Finally, let

$s'_i = \max_{j=1,2} s_{ij}, \qquad u'_i = \min_{j=1,2} u_{ij}.$

Now let the points $p_1$, $p_2$, respectively, be the points $(0, \epsilon)$, $(0, -\epsilon)$ of a
Cartesian $(X, Y)$ plane. Suppose that the point $[t_1] = [t_2]$ is not in Case 1
or Case 2. Then there are two cases to be distinguished.


Case 6.1. Two of the arcs $(s'_i, t_i)$, $(t_i, u'_i)$ lie in the right half plane
and two in the left half plane. Since $[t_1] = [t_2]$ is not in Case 1, these four
arcs do not all lie on the $X$-axis. Choose the quadrant which contains a point
of one of these arcs as the first quadrant by making the appropriate changes
of notation. Let $K$ be a half line terminated by $[t_1] = [t_2]$ and lying in the
first quadrant. Then, clearly, $K$ can be chosen so near the positive $Y$-axis
that it does not meet $S^*_1$ or $S^*_2$ but does meet one of the four arcs $(s'_i, t_i)$,
$(t_i, u'_i)$. Let $[t^*]$ be the first point where $K$ meets one of these four
arcs. (That there is such a first point is clear from the fact that near $[t_1]$,
$K$ lies in one of the salient circles $S_1$, $S_2$ and so does not meet $C$.) For the
sake of definiteness let it be assumed that $s'_1 < t^* < t_1$ and that $[s'_2, t_2]$ is
the other arc in the right half plane. It is clear from the definition of $t^*$,
together with Assumption A, that $[s'_2, t_2]$ is “below” $[s'_1, t_1]$. Assume also
that $S^*_1$ is in the upper half plane.

Then there are no points of $S^*_1$ salient to $(t^*, t_1)$. In fact, the entire
arc $(t^*, t_1)$ is separated by $(s'_2, t_2)$ from $S^*_1$ in the convex curve consisting
of that part of $S_2(\rho_2)$ lying below and to the right of that tangent line to $S^*_1$
through $[t_1] = [t_2]$ which has positive slope. Thus, by (12b), all points
$P(t^*, t_1)$ lie in $S^*_2$.

Now it may be shown, by an argument exactly like that used in the proof
of Lemma 10, that the curve obtained from $C$ by replacing the arc $(t^*, t_1)$
by the chord $K([t^*], [t_1])$ is a shorter curve with the property ($\epsilon$). This
contradiction completes the treatment of Case 6.1.


Case 6.2. At least three of the arcs $(s'_i, t_i)$, $(t_i, u'_i)$ lie in the same
half plane determined by the $Y$-axis. In this case choose the half plane which
contains at least three of these arcs as the right half plane. Since $[t_1] = [t_2]$
is not in Case 2, these three arcs do not lie on the $X$-axis. Choose the quadrant
which contains a point of one of these arcs as the first quadrant. Then all
the essential features of Case 6.1 are obtained and the treatment proceeds in
the same way by simply neglecting the extraneous arc or arcs.


15 Again this is possible except in the case that some one of these four arcs lies
on the boundary of $S_i(0) = S_i$. The necessary modification in this case is trivial.

This completes the proof of Theorem 1. 

7. Corollaries. Theorem 1 allows several previous results to be strengthened.
In connection with the inequality (2c) for $\theta(t)$ we have


LEMMA 13. $\theta(t) = \pi$ if and only if $[t]$ is doubly significant.


Proof. If $\theta(t) = \pi$ for a multisignificant point $[t]$, then, by Lemma 12
(cf. footnote 9), $C$ has a double point. But this is impossible by Theorem 1.
Then Lemma 8 can be stated more simply as


LEMMA 8 bis. The set of doubly significant points of $C$ is closed.


8. Local properties resumed. In this section the discussion of the
nature of $C$ in the neighborhood of the various types of points will be resumed
and further results obtained which are more conveniently proved with the help
of Theorem 1. The first of these results is complementary to Lemma 11 and
supplementary to Lemmas 10 and 12.


LEMMA 14. Let $[t_0]$ be not doubly significant. Then some arc $(s, u)$
with $s < t_0 < u$ is convex toward $P(t_0)$.¹⁶

Proof. If $[t_0]$ is non-significant, this is so by Lemma 9.

Suppose first, then, that $[t_0]$ is simply significant and let $s_1$, $u_1$ be
chosen so that


(13b) $P(s_1, u_1) \subset S(t_0)$.


The possibility of satisfying (13b) follows from Lemma 6. From (13b) it
follows, a fortiori, that


(14) 


Now let $S(\rho)$ denote a circle of radius $\epsilon + \rho$ about $p(t_0)$, where $\rho > 0$
is so small that $S(\rho)$ cuts both arcs $(s_1, t_0)$, $(t_0, u_1)$ and let $s$, $u$, respectively,
be the greatest $t < t_0$ and the least $t > t_0$ such that $[s]$ and $[u]$ are on $S(\rho)$.
(The existence of such a $\rho > 0$ is obvious from Lemma 10.) Let $A(s, u)$
be the principal arc of $S(\rho)$ joining $[s]$ and $[u]$.

Then the closed curve $\Gamma$ defined by


$\Gamma = [s, u] + A(s, u)$


16 An arc is said to be convex if it can be made an arc of a convex curve. A
nonlinear arc is said to be convex toward a given set if the given set necessarily lies
outside any such convex curve. A linear segment is said to be convex toward a given
set if the set lies on one side of the linear extension of the segment.




is simple. In fact $[s, u]$ cannot touch itself by Theorem 1 and cannot touch
$A(s, u)$ (except at $[s]$ and $[u]$) by the definition of $s$ and $u$. It will now
be shown that the simple closed curve $\Gamma$ is convex. Suppose this is not the
case; i.e., that some chord $K([t_1], [t_2])$ lies outside $\Gamma$, where $s \le t_1 < t_2 \le u$.
Then the arc consisting of $[s, t_1] + K([t_1], [t_2]) + [t_2, u]$
separates $(t_1, t_2)$ from $S(t_0)$ in $S(\rho)$. But, by a now familiar argument,
this contradicts (14). The fact that the convex arc $(s, u)$ is convex toward
$p(t_0)$ is obvious from Lemma 10.
The only case remaining to be considered is that $[t_0]$ is multisignificant.
In this case (13a) is replaced by a choice of $s_1$, $u_1$ such that

$s_1 < t_0 < u_1$,
$P(s_1, t_0) \subset S_1(t_0), \qquad P(t_0, u_1) \subset S_2(t_0)$


(where $S_1(t_0)$, $S_2(t_0)$ are, of course, the terminal salient circles through $[t_0]$),
by using Lemma 4. Then a double repetition of the preceding argument,
somewhat modified of course, shows that some $(s, t_0)$ is convex toward $p_1(t_0)$
and some $(t_0, u)$ is convex toward $p_2(t_0)$. These facts, together with Lemma
12, are easily seen to imply the convexity of $(s, u)$ toward $P(t_0)$.

It should be remarked that this Lemma 14 is not simply a consequence
of the results of Lemma 8 bis, Lemma 10, Theorem 1 (as might be suspected)
in view of the possible existence of linear segments converging to $[t_0]$. In
fact it is quite easy to construct an example of a simple curve with the local
half plane or “local supporting line” property of Lemma 10 at every point
but which is not locally convex at some point. However, this can only be done
by the introduction of linear segments. The complementary Lemmas 14 and
11 lead immediately to


THEOREM 2. Let $M$ be an arbitrary bounded plane point set. Let $\epsilon > 0$
be fixed and let $C$ be an $\epsilon$-ergodic curve for $M$. Then at any point $[t]$ of $C$,
0<t <i, there is a right and left hand tangent to C. 

Proof. If $[t_0]$ is not doubly significant, then, in view of Lemma 14,
this follows from a well known theorem on convex curves. If $[t_0]$ is doubly
significant, this is trivial in view of Lemma 11.

The difficulty (of linear segments) mentioned above in connection with 
the extension from the local supporting line to local convexity also prevents 
the extension from local convexity to convexity in the large. In the case at 
hand, however, the extension from local convexity to a kind of convexity in 
the large can be made without eliminating all linear segments—it is enough 
to eliminate those linear segments which are non-significant. 




LEMMA 15. Let $(s, u)$ be an arc of $C$ containing no non-significant or
doubly significant points. Then $(s, u)$ consists of a finite number of convex
arcs.


Proof. Let $\phi(t)$ be the inclination which the left hand tangent to $C$ at
$[t]$ has with the $x$-axis. In view of Lemma 14 the indetermination ($+2k\pi$)
of $\phi(t)$ may be determined so that $\phi(t)$ is monotone (in the weak sense) in
some neighborhood of each $t$ for which $[t]$ is not doubly significant. At the
same time $\phi(t)$ may be supposed bounded. To see this it is enough to notice
that $C$ cannot spirally converge to a point (remembering that $C$ is simple and
of finite length). But this is easily seen since if spiral convergence occurred
the points of $C$ near the limit point would be non-significant and hence $C$
would be linear near this point which is a contradiction. It is now supposed
that some well determined $\phi(t)$ is chosen which is bounded and weakly monotone
in some neighborhood of every $t$ where $[t]$ is not doubly significant.

Now let $(s, u)$ contain no non-significant or doubly significant points so
that $\phi(t)$ is weakly monotone in some neighborhood of every $t$ for $s < t < u$.
It will be shown now that $\phi(t)$ is weakly monotone in the entire interval
$s < t < u$. For suppose this is not so. Then there is some sub-interval
$s_1 \le t \le u_1$ of $s < t < u$ such that the point $t_1$ where $\phi(t)$ attains its
maximum¹⁷ (or its minimum) over the interval $s_1 \le t \le u_1$ is an interior
point $s_1 < t_1 < u_1$. For the sake of definiteness consider the case of a
maximum; i.e., suppose that

(15b) $\phi(t_1) \ge \phi(t), \qquad s_1 \le t \le u_1.$


But (15b) contradicts the local monotony of $\phi(t)$ unless the equality sign
holds in (15b) near $t_1$, at least on one side of $t_1$. Thus there are values $s_2$, $u_2$
such that¹⁸

(16a) <8 < th, Ss 


(16b) $\phi(t_1) = \phi(t), \qquad s_2 < t < u_2.$


Without loss of generality, let it be assumed that the segment $(s_2, u_2)$
$= K([s_2], [u_2])$ lies along the $x$-axis and that $\phi(t_1) = 0$. By the local
monotony of $\phi(t)$ at $s_2$ and $u_2$, in connection with (16b) and footnote 18,
values $s_3$, $u_3$ may be chosen so that $s_3 < s_2$, $u_2 < u_3$ and


17 The fact that $\phi(t)$ actually attains its maximum (minimum) for any such closed
interval is a trivial consequence of the local monotony of $\phi(t)$.

18 It is supposed $s_2$, $u_2$ are chosen so that $u_2 - s_2$ is as large as possible. The equation
of (16b) may or may not hold with $t = s_2$ or $t = u_2$.




(17a) $\phi(t) < 0, \qquad s_3 < t < s_2$;

(17b) $\phi(t) < 0, \qquad u_2 < t < u_3$.

On the other hand, by Lemma 14, these values $s_3$, $u_3$ may also be chosen so that
(18a) $(s_3, s_2)$ is convex toward $P(s_2)$;

(18b) $(u_2, u_3)$ is convex toward $P(u_2)$.


Now, by assumption, all points $[t]$ for $s_2 < t < u_2$ are either simply
significant or multisignificant. But this last alternative is seen to be impossible
if Lemma 12 is compared with (16b). Thus


$P(t) = (p(t)), \qquad s_2 < t < u_2.$


The set of all points $\{p(t)\}$, for $s_2 < t < u_2$, is, in view of Lemma 6 and
Definitions 10 and 12, a linear segment parallel to $(s_2, u_2)$ and lying $\epsilon$ units
above or $\epsilon$ units below $(s_2, u_2)$. By comparing (17a) and (18a), it is seen,
in view of Lemma 6, that this $\{p(t)\}$ lies below $(s_2, u_2)$. However (17b)
and (18b) show, in the same way, that $\{p(t)\}$ lies above $(s_2, u_2)$. This
contradiction completes the proof of the fact that $\phi(t)$ is weakly monotone in
the entire interval $s < t < u$.

To complete the proof of Lemma 15, it is enough to divide $(s, u)$ into
sub-arcs such that the variation of $\phi(t)$ over any sub-arc is $< \pi$. By the
monotony and boundedness of $\phi(t)$ the number of such sub-arcs is finite.
That each such sub-arc is convex is then trivial.

As usual a point where the right and left hand tangents differ will be 
called a corner. In particular, every multisignificant point is a corner. On 
the other hand a non-significant point or a doubly significant point cannot be 
a corner. A simply significant point may or may not be a corner as trivial 
examples show. With regard to corners, in addition to these remarks, there 
will now be shown: 


THEOREM 3. With the notations of Theorem 2, there are only a countable 
number of corners on C. 


Proof. Let $Q_n$, $Q_d$ denote, respectively, the set of non-significant, doubly
significant points of $C$. The set $Q_n$ is, by Lemma 5, an open subset of $C$ so
that $Q_n$ is of the form


$Q_n = \sum_{i=1}^{\infty} (s_i, u_i).$


But, by Lemma 9, no point of $Q_n$ is a corner. Thus the set


$\bar{Q}_n = \sum_{i=1}^{\infty} [s_i, u_i]$


contains at most a countable number of corners.

Now the set $Q_d$, which is closed by Lemma 8 bis, can contain no corners
in view of Lemma 11. Thus to prove this Theorem 3, it is enough to show
that there are at most a countable number of corners among the points of


$Q = C - \bar{Q}_n - Q_d.$


But this $Q$ is an open subset of $C$; i.e.,
$Q = \sum_{i=1}^{\infty} (s'_i, u'_i).$
Also $(s'_i, u'_i)$ consists of a finite number of convex arcs, by Lemma 15. Thus
$Q = \sum_{i=1}^{\infty} (s''_i, u''_i)$


where $(s''_i, u''_i)$ is convex. But it is a standard theorem that a convex arc
contains only a countable number of corners. This completes the proof of
Theorem 3.

As usual, by a cusp will be meant a point where there is a well defined
tangent line but where the curve does not cross the normal. With regard to
these it is easy to prove

THEOREM 4. With the notations of Theorem 2, the curve $C$ has no cusps.


Proof. In view of Lemma 14, a point $[t]$ which is not doubly significant
cannot be a cusp of $C$, since a convex curve can have no cusps. Thus it is
sufficient to show that a doubly significant point cannot be a cusp. This can
easily be done by an argument exactly like that which showed that $\theta(t) = \pi$
was impossible for a multisignificant point.


9. The ergodic function. In this section, contrasting with the preceding,
$C$ will be used to denote any continuous rectifiable plane Jordan curve
of length $L$ while an $\epsilon$-ergodic curve for $M$ will be denoted by $C(\epsilon)$ and its
length by $\Lambda(\epsilon)$.

Let $D(C; \epsilon)$ denote the set of all points in the plane at a distance $\le \epsilon$
from some point of $C$. Thus $D(C; \epsilon)$ is the domain swept out by a circle of
radius $\epsilon$ whose center traverses $C$. Then $D(C; \epsilon)$ is measurable (in fact
closed) and

$\operatorname{meas} D(C; \epsilon) \le 2\epsilon L + \pi\epsilon^2.$


This inequality is given by Errera¹⁹ for simple Jordan arcs, but actually, in
his proof, no use is made of the fact that the arc is simple. In fact the
inequality is strengthened if $C$ is not simple.


19 Loc. cit., 2.
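
The bound on meas D(C; ε) can be checked numerically on simple examples. The following Python sketch is not part of the original argument. It estimates the area of the ε-neighborhood of a polygonal curve on a square grid and compares it with 2εL + πε². The function names, the grid resolution `cells`, and the restriction to polygonal curves are assumptions made only for this illustration.

```python
import math


def dist_point_segment(px, py, ax, ay, bx, by):
    """Distance from the point (px, py) to the segment joining (ax, ay) and (bx, by)."""
    dx, dy = bx - ax, by - ay
    seg2 = dx * dx + dy * dy
    if seg2 == 0.0:
        return math.hypot(px - ax, py - ay)
    # Parameter of the orthogonal projection, clamped to the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def polyline_length(vertices):
    """Length L of the polygonal curve through the given vertices."""
    return sum(math.dist(p, q) for p, q in zip(vertices, vertices[1:]))


def neighborhood_area(vertices, eps, cells=300):
    """Midpoint-rule grid estimate of meas D(C; eps) for the polygonal curve C."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    x0, x1 = min(xs) - eps, max(xs) + eps
    y0, y1 = min(ys) - eps, max(ys) + eps
    h = max(x1 - x0, y1 - y0) / cells
    nx = math.ceil((x1 - x0) / h)
    ny = math.ceil((y1 - y0) / h)
    segments = list(zip(vertices, vertices[1:]))
    inside = 0
    for i in range(nx):
        cx = x0 + (i + 0.5) * h
        for j in range(ny):
            cy = y0 + (j + 0.5) * h
            # A cell counts if its center is within eps of some segment of C.
            if any(dist_point_segment(cx, cy, ax, ay, bx, by) <= eps
                   for (ax, ay), (bx, by) in segments):
                inside += 1
    return inside * h * h


def measure_bound(length, eps):
    """Right-hand side 2*eps*L + pi*eps^2 of the inequality."""
    return 2 * eps * length + math.pi * eps ** 2
```

For a single straight segment the neighborhood is a stadium and the bound is attained up to grid error. For a curve with a right-angle corner the two strips overlap near the corner, so the estimated area falls strictly below the bound.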

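The fact that $C$ has the property ($\epsilon$) with respect to $M$ is clearly
expressed by
(19) $D(C; \epsilon) \supset M.$


But since $D(C; \epsilon)$ is closed, (19) may be strengthened to
(20) $D(C; \epsilon) \supset \bar{M}$
where $\bar{M}$ is the closure of $M$. Then by (19) and (20) and the fact that
$L = \Lambda(\epsilon)$ for $C = C(\epsilon)$
(21) $\operatorname{meas} \bar{M} \le 2\epsilon\Lambda(\epsilon) + \pi\epsilon^2.$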
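Now, immediately, we have
LEMMA 16. For an arbitrary bounded set $M$


$\liminf_{\epsilon \to 0} 2\epsilon\Lambda(\epsilon) \ge \operatorname{meas} \bar{M}.$


Next it is to be shown that $\limsup 2\epsilon\Lambda(\epsilon) \le \operatorname{meas} \bar{M}$. In this direction
we prove first

LEMMA 17. Let $R$ be a rectangle of perimeter $P$. Then

$2\epsilon\Lambda(\epsilon) \le \operatorname{meas} R + 2P\epsilon + 16\epsilon^2.$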

Proof. Let the rectangle be placed on a Cartesian coordinate system with
its vertices at $(0, 0)$, $(a, 0)$, $(0, b)$, $(a, b)$. Then let a curve $C = C_\epsilon$ be
traced in the following way: First trace the horizontal segment $(0, 0)$ to
$(a, 0)$, then the semicircle of radius $\epsilon$ which has the segment $(a, 0)$ to $(a, 2\epsilon)$
as diameter and which lies outside $R$, then the segment $(a, 2\epsilon)$ to $(0, 2\epsilon)$,
then the semicircle of radius $\epsilon$ which has the segment $(0, 2\epsilon)$ to $(0, 4\epsilon)$ as
diameter and which lies outside $R$, then the segment $(0, 4\epsilon)$ to $(a, 4\epsilon)$, etc.
Continue in this manner until a horizontal segment has been drawn which lies
above $R$. It may be easily verified that for the curve $C_\epsilon$ so constructed, the
inequality (21) becomes an equality; i.e., we have


(22) $\operatorname{meas} D(C_\epsilon; \epsilon) = 2\epsilon L(\epsilon) + \pi\epsilon^2$
where $L(\epsilon)$ is the length of $C_\epsilon$. It is equally obvious that
(23) $R \subset D(C_\epsilon; \epsilon) \subset R^*$


where $R^*$ is the rectangle with vertices $(-2\epsilon, -2\epsilon)$, $(a + 2\epsilon, -2\epsilon)$,
$(-2\epsilon, b + 2\epsilon)$, $(a + 2\epsilon, b + 2\epsilon)$. The first inclusion (23) shows that
$C_\epsilon$ is $\epsilon$-ergodic to $R$ so that

(24) $\Lambda(\epsilon) \le L(\epsilon)$


and the second inclusion (23) gives




(25) $\operatorname{meas} D(C_\epsilon; \epsilon) \le \operatorname{meas} R + 2P\epsilon + 16\epsilon^2.$


Combination of (22), (24), and (25) gives the inequality of Lemma 17.
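
The construction in this proof lends itself to a direct computation. The Python sketch below is an illustration under stated assumptions, not the author's method: it takes "a horizontal segment which lies above R" to mean the first segment at height at least b, so that segments are drawn at heights 0, 2ε, ..., 2kε with k the least integer satisfying 2kε ≥ b. The names `serpentine_length` and `lemma17_bound` are hypothetical.

```python
import math


def serpentine_length(a, b, eps):
    """Length L(eps) of the back-and-forth curve over an a-by-b rectangle.

    Assumption: k + 1 horizontal segments of length a are drawn at heights
    0, 2*eps, ..., 2*k*eps, where k is the least integer with 2*k*eps >= b,
    and consecutive segments are joined by k semicircles of radius eps.
    """
    k = max(1, math.ceil(b / (2 * eps)))
    return (k + 1) * a + k * math.pi * eps


def lemma17_bound(a, b, eps):
    """Right-hand side meas R + 2*P*eps + 16*eps^2 with P = 2*(a + b)."""
    perimeter = 2 * (a + b)
    return a * b + 2 * perimeter * eps + 16 * eps ** 2
```

Since Λ(ε) ≤ L(ε), checking 2εL(ε) against `lemma17_bound` tests the inequality of Lemma 17 for this particular curve. As ε decreases, 2εL(ε) approaches the area ab, which is the behavior that Theorem 5 asserts for 2εΛ(ε).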
We are now ready to demonstrate


LEMMA 18. For an arbitrary bounded set $M$


$\limsup_{\epsilon \to 0} 2\epsilon\Lambda(\epsilon) \le \operatorname{meas} \bar{M}.$


Proof. Let $\eta > 0$ be chosen arbitrarily. Then, since $\bar{M}$ is closed, there
exists a finite set of rectangles, $R_1, R_2, \cdots, R_n$, such that
(26) $\sum_{i=1}^{n} R_i \supset \bar{M}$
and
(27) $\operatorname{meas} \sum_{i=1}^{n} R_i \le \operatorname{meas} \bar{M} + \eta.$


In each of the rectangles $R_i$ consider an $\epsilon$-ergodic curve $C_i = C_i(\epsilon)$ of length
$\Lambda_i(\epsilon)$. Let $C_i$ be oriented so that it is possible to speak of the beginning
point of $C_i$ and the end point of $C_i$. Finally consider the curve $C = C_\epsilon$
consisting of the $n$ curves $C_i$ and the $n - 1$ linear segments joining the end
point of $C_i$ to the beginning point of $C_{i+1}$ ($i = 1, 2, \cdots, n - 1$). Then $C$
is clearly $\epsilon$-ergodic to $\sum_{i=1}^{n} R_i$ and, by (26), a fortiori, to $M$. Thus

(28) $\Lambda(\epsilon) \le L(\epsilon)$
where $\Lambda(\epsilon)$ is the ergodic function for $M$ and $L(\epsilon)$ is the length of $C_\epsilon$. On
the other hand


(29) $L(\epsilon) \le \sum_{i=1}^{n} \Lambda_i(\epsilon) + (n - 1)D$


where $D$ is the maximum distance between two points of $\sum_{i=1}^{n} R_i$. Applying
Lemma 17 to each $\Lambda_i$ we have from (29)
(30) $2\epsilon L(\epsilon) \le \operatorname{meas} \sum_{i=1}^{n} R_i + 2\epsilon \sum_{i=1}^{n} P_i + 16n\epsilon^2 + 2(n - 1)\epsilon D$


where $P_i$ is the perimeter of $R_i$. Combining (27), (28), and (30) we have
at once
(31) $\limsup_{\epsilon \to 0} 2\epsilon\Lambda(\epsilon) \le \operatorname{meas} \bar{M} + \eta.$


Since (31) holds for an arbitrary $\eta > 0$, the proof of Lemma 18 is complete.
Lemma 16 and Lemma 18 together imply


THEOREM 5. For an arbitrary bounded set $M$
$\lim_{\epsilon \to 0} 2\epsilon\Lambda(\epsilon) = \operatorname{meas} \bar{M}.$

THE UNIVERSITY OF WISCONSIN. 




REMARKS ON A SPECIAL CLASS OF ALGEBRAS.* 


By O. F. G. SCHILLING. 


It was shown by Hasse and Witt that the structure of normal simple 
algebras over algebraic numberfields and certain fields of algebraic functions 
can be described in terms of the arithmetic of the underlying groundfield. 
In this note we discuss algebras over function fields of one variable whose 
coefficient fields are fields which have only cyclic extensions. It turns out 
that quite a few of the results of the afore-mentioned theories are still true 
under our assumptions, e.g. the theorem concerning the sum of the local 
invariants of an algebra. However, the step from the theory of algebras to 
class field theory can no more be made. Our results throw some light on the 
axiomatic treatment of the class field theory in the large. They clearly 
indicate that the validity of the norm theorem does not imply the law of 
reciprocity. The reason for this deviation from the classical theory can be 
found in the fact that the Takagi group of a cyclic extension is in general 
a proper subgroup of a suitably defined Artin group. 

1. Structure of the groundfield. Let $T$ be a field which has only
cyclic extensions. We shall suppose that for every integer $n$ there exists at
least one cyclic extension $T_n$ of degree $n$ over $T$. The Galois theory then
immediately implies that the extensions $T_n$ are unique, i.e. for every integer
$n$ there exists exactly one field $T_n$. We now want to investigate the structure
of the field $T$.

LEMMA 1. The field $T$ is either an absolutely algebraic field of characteristic
$\chi \ne 0$ whose Steinitz number has no infinite component or it is
relatively complete with respect to a non-trivial valuation $V$.


Proof. We distinguish two cases
i) $T$ admits no valuation but the trivial one,


ii) $T$ has non-trivial valuations $V$.


* Received August 21, 1939.

1 H. Hasse, “Theorie der relativ-zyklischen algebraischen Funktionenkörper, insbesondere
bei endlichem Konstantenkörper,” Journ. f. d. r. u. a. Math., vol. 172 (1939),
pp. 37-54; E. Witt, “Riemann-Rochscher Satz und Z-Funktion im Hyperkomplexen,”
Mathematische Annalen, vol. 110 (1935), pp. 12-28. These papers will be referred
to as H and W, respectively.




In the first case $T$ necessarily is a field of characteristic $\chi \ne 0$. Moreover,
$T$ must be absolutely algebraic over its prime field $T_\chi$ for otherwise we could
construct non-trivial valuations by means of a transcendence basis.² Let
$T_\chi \subset T^{(1)} \subset \cdots \subset T^{(i)} \subset \cdots \subset T$ be an approximating tower of
$T$ over $T_\chi$. Then the formal least common multiple of the degrees $[T^{(i)} : T_\chi]$
is called the Steinitz number of $T$. The assumption that $T$ have for every
integer $n$ exactly one (cyclic) extension $T_n$ implies then that the Steinitz
number has no infinite component.³
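
The notion of a formal least common multiple can be made concrete for finite pieces of a tower. The Python sketch below is an illustration only: it records, for finitely many degrees, the exponent of each prime in their least common multiple. The Steinitz number itself is the limit of these exponents along the whole tower, which the code cannot compute. The function names and the sample degree sequences are assumptions, not taken from the paper.

```python
def prime_exponents(n):
    """Return {prime: exponent} for a positive integer n, by trial division."""
    exps = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            exps[p] = exps.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        exps[n] = exps.get(n, 0) + 1
    return exps


def partial_steinitz(degrees):
    """Prime exponents of the least common multiple of finitely many degrees."""
    lcm_exps = {}
    for d in degrees:
        for p, e in prime_exponents(d).items():
            lcm_exps[p] = max(lcm_exps.get(p, 0), e)
    return lcm_exps
```

For the degrees 2, 4, 8, 16 the result is `{2: 4}`, and the exponent of 2 keeps growing as the tower is extended, which in the limit gives an infinite component. For the degrees 2, 6, 30, 210 the result is `{2: 1, 3: 1, 5: 1, 7: 1}`, and continuing with products of distinct primes leaves every exponent equal to 1, so no infinite component appears.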

In the second case the field $T$ admits at least one non-trivial valuation $V$.
If the value group of $V$ is well-ordered in a suitable fashion then it can be
shown that $V$ is the composite of a rank 1 valuation $V$ and a valuation $V'$
of the residue class field of $V$.⁴ Let $\bar{T}$ be the complete closure of the field $T$
with respect to the valuation $V$. In order to prove that the field $T$ is relatively
complete with respect to the valuation $V$ it suffices to show that $\bar{T}$ contains
no other elements algebraic over $T$ but the elements of $T$.⁵ In other words,
we must prove that $T$ is the universal decomposition field (with respect to $V$)
of its algebraic closure. Let $f_n(x) = x^n + a_1 x^{n-1} + \cdots + a_n = 0$ be an
irreducible equation of degree $n$ with coefficients in $\bar{T}$. We associate with $f_n(x)$
a polynomial $g_n(x) = x^n + b_1 x^{n-1} + \cdots + b_n$ with coefficients in $T$ such
that $V(a_i - b_i) > M > 0$, where $M$ is sufficiently large. It can be shown
that $g_n(x) = 0$ is irreducible in $\bar{T}$ and that its roots generate the same field
over $\bar{T}$ as the roots of $f_n(x) = 0$.⁶ Since $g_n(x) = 0$ is also irreducible in $T$
its roots generate the cyclic extension $T_n$ of degree $n$ over $T$. Hence the field
generated by $f_n(x) = 0$ is given as $T_n\bar{T}$, i.e. it is cyclic and has relative
degree $n$. Thus $T$ is relatively complete with respect to the valuation $V$.
If the valuation $V$ is discrete (i.e., if its value group is isomorphic with the
additive group of all integers), then $T = \bar{T}$ by a theorem of F. K. Schmidt.⁷

In general we can prove that $T$ is relatively complete with respect to
exactly one rank 1 valuation. Namely suppose that $T$ is relatively complete

2 A. Ostrowski, “Untersuchungen zur arithmetischen Theorie der Körper,”
Mathematische Zeitschrift, vol. 39 (1934), pp. 269-404.

3 M. Moriya and O. F. G. Schilling, “Zur Klassenkörpertheorie über unendlichen
perfekten Körpern,” Journal of the Fac. of Science Hokkaido Imperial University,
Ser. I, vol. 5 (1937), pp. 189-205.

4 W. Krull, “Allgemeine Bewertungstheorie,” Journ. f. d. r. u. a. Math., vol. 167
(1931), pp. 160-196.

5 A. Ostrowski, loc. cit.

6 O. F. G. Schilling, “A generalization of local class field theory,” American
Journal of Mathematics, vol. 60 (1938), pp. 667-704.

7 F. K. Schmidt, “Mehrfach perfekte Körper,” Mathematische Annalen, vol. 108
(1933), pp. 1-25.




with respect to another rank 1 valuation $V_1$. Then we can construct irreducible
equations $h_n(x) = 0$ with coefficients in $T$ which have prescribed
characters of decomposition with respect to $V$ and $V_1$.⁸ Repeating the
preceding argument we immediately see that the existence of a valuation $V_1$
with the asserted properties leads to a contradiction to the assumptions on
the field $T$.


Remark. For the actual construction of fields $T$ see a paper of the
author on formal power series of several variables.⁹


DEFINITION. A field $T$ is said to be quasi-algebraically closed if it is
never the center of proper division algebras of finite rank.¹⁰


THEOREM 1. A field $T$ which has only cyclic extensions is quasi-algebraically
closed.


Proof. First, our hypothesis implies that the field $T$ is algebraically
perfect. Namely $T$ is supposed to possess only cyclic extensions. Now it
follows immediately, by a theorem of Albert, that $T$ never is the center of a
proper division algebra whose degree is a power of the characteristic $\chi \ne 0$.¹¹
Thus it remains to discuss
algebras $A$ whose degrees $n = \prod p_i^{a_i}$ are relatively prime to the characteristic.


The structure theory of algebras yields that $A \sim D_1 \times \cdots \times D_s$ where the
algebras $D_i$ are normal division algebras of degrees $p_i^{b_i}$ over $T$, respectively;
0<b;Sa;. We want to show Dj~T. Let Since T has, by 
hypothesis, only cyclic extensions, it follows that D; is split by a field 7 of 


degree pi, fj Sb Lett CTO < T be the 


chain of cyclic subfields of 7')/T such that [7:74] =p. We shall 
prove by induction that Di X T') ~T implies Dj ~T. Suppose that we 
already proved XTO~T®, Then Dy X ~ a), 
aj.~0in T%), If a;, is a p-th power nothing has to be proved. So let 
11 c%;,. Suppose that 7%) contains the p-th roots of unity. Then 
T TU) Consequently, 


8 B. L. van der Waerden, Moderne Algebra, vol. I (Berlin, 1937), 2nd edition,
pp. 201-202.

9 O. F. G. Schilling, “Arithmetic in fields of formal power series in several
variables,” Annals of Mathematics, vol. 38 (1937), pp. 551-576.

10 O. F. G. Schilling, “The structure of local class field theory,” American Journal
of Mathematics, vol. 60 (1938), pp. 75-100.

11 A. A. Albert, “Normal division algebras of degree $p^e$ over fields of characteristic
$p$,” Transactions of the American Mathematical Society, vol. 39 (1936), pp. 183-188.




This similarity implies that ay, = i.e. Di The 
preceding argument can also be applied for p = 2 for our assumptions exclude 
that 7 is a totally real field. Suppose next that 7%” does not contain 
the p-th roots of unity ¢‘, A=1,- - -,p. Consider then the algebra 
DX TE) & TE) over TY (€) as groundfield. Since (g): 
=p—1, it follows that 7 (€) is a splitting field of the extended algebra. 
As before we conclude that is a splitting field of Di (). 
But this is impossible if K TU) 


2. Foundations of local class field theory of discrete complete fields.

Let $C$ be a field which is complete with respect to a rank 1 valuation $\mathfrak{p}$
and has the field $T$ as residue class field. Since Hensel’s Lemma holds for $C$
it follows that the unramified extensions $C_n$ of degree $n$ over $C$ are in (1 - 1)-
correspondence with the extensions $T_n$ of $T$. Consequently the generating
automorphisms $F_n$ of the various Galois groups $G(C_n|C)$ can be selected such
that they induce the generating automorphisms of the Galois groups $G(T_n|T)$.
Let $C_\infty$ denote the maximal unramified extension of $C$. The Galois group
$G(C_\infty|C)$ is an ideal cyclic group. Selecting once and for all an element $F$
in $G(C_\infty|C)$ we observe that the infinite cyclic group $\{F^\lambda$, $\lambda$ running over
the additive group of all integers$\}$ is everywhere dense in $G(C_\infty|C)$. Consequently
the element $F$ induces for every $n$ a generating automorphism $F_n$
of $G(C_n|C)$.¹² Having fixed the automorphism $F$, the substitutions $F_n$ have
the same algebraic properties as the Frobenius automorphisms of the classical
ramification theory.
In order to derive the local class field theory relative to the field $C$ it is
sufficient to prove the following lemma.


LEMMA 2. All units of $C$ are norms of units in $C_n$.


Proof. Let $u$ be an arbitrary unit of $C$. Then, by Theorem 1, its residue
class $u$ mod $\mathfrak{p}$ in $T$ is the norm of an element $R$ of $T_n$, i.e. $u \equiv NR \pmod{\mathfrak{p}}$.
Thus we have a first $\mathfrak{p}$-adic approximation of $u$ as a norm of a unit $U \equiv R \pmod{\mathfrak{p}}$
in $C_n$. Since $u(NU)^{-1} \equiv 1 \pmod{\mathfrak{p}}$, the customary procedure of $\mathfrak{p}$-adic
approximation yields that $u(NU)^{-1} = NH$, where $H \equiv 1 \pmod{\mathfrak{p}}$.¹³

The usual arguments of local class field theory imply that Lemma 2 yields
the following theorem.


THEOREM 2. Every normal simple algebra $A$ over $C$ is similar to a


12 O. F. G. Schilling, “Regular normal extensions over complete fields,” to appear
in the Annals of Mathematics.

13 E. Witt, “Schiefkörper über diskret bewerteten Körpern,” Journ. f. d. r. u. a.
Math., vol. 176 (1937), pp. 153-156.




cyclic algebra $(C_n/C, F_n, \pi^\nu)$ where $\pi$ denotes a fixed prime element of the
valuation $\mathfrak{p}$.


As in the classical theory we define the residue $\nu/n$ mod 1 as the invariant
of the algebra $A$. Having selected $F$ and $\pi$ every algebra $A$ is uniquely
determined in its class by its invariant.


3. Algebras in the large. Let $k$ be an arbitrary function field of one
variable with coefficients in the field $T$. Now let $A$ be an arbitrary normal
simple algebra over $k$ as groundfield. Since $k$ is a function field of one variable
it follows, by a theorem of Tsen, that the algebraically closed field $T_\infty$ of $T$
suffices as a splitting field of $A$ when adjoined to $k$.¹⁵ Hence a suitable finite
extension $kT_n$ of $k$ already splits the given algebra $A$.


Thus, $A \sim (kT_n/k, F_n, a)$, $a \ne 0$ in $k$.
We next want to determine the local invariants $r(\mathfrak{p})$ of the algebra $A/k$.¹⁶
These characters $r(\mathfrak{p})$ of $A$ are uniquely determined by virtue of Theorem 2.


First let us observe that
$(kT_n/k, F_n, a) \sim (kT_n/k, F_n, ab)$ for any $b \ne 0$ in $T$.
Namely, $(kT_n/k, F_n, b) \sim (T_n/T, F_n, b) \times k \sim k$. Consequently the structure
of $A$ depends only on the divisor $(a) = \prod_{i=1}^{s} \mathfrak{p}_i^{\alpha_i}$. Since $A \sim (kT_n/k, F_n, a)$,
we get 
(Pnky/ky, Frnt, 
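where $d = (n, f(\mathfrak{p}))$ and $e = f(\mathfrak{p})d^{-1}$. Here $f(\mathfrak{p})$ denotes the absolute degree
of the prime divisor $\mathfrak{p}$. As a consequence of Lemma 2 and Theorem 2
we find that the algebra $A_\mathfrak{p}$ is completely determined by the invariant
$r(\mathfrak{p}) = f(\mathfrak{p})\alpha(\mathfrak{p})n^{-1} \pmod 1$, where $\mathfrak{p}^{\alpha(\mathfrak{p})} \parallel (a)$. We remark that
$r(\mathfrak{p}) \equiv 0 \pmod 1$ if $\mathfrak{p} \nmid (a)$; namely then $a$ is a unit for the prime divisor $\mathfrak{p}$.
Hence, by Lemma 2, $A_\mathfrak{p} \sim k_\mathfrak{p}$. Therefore, the algebra $A$ is ramified at most
at the prime divisors of $(a)$. As usual it follows that


$\sum_{(\mathfrak{p})} r(\mathfrak{p}) \equiv 0 \pmod 1$ for the invariants of an arbitrary algebra $A/k$,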


14 C. Chevalley, “La théorie du symbole de restes normiques,” Journ. f. d. r. u. a.
Math., vol. 169 (1933), pp. 140-157.

15 Ch. C. Tsen, “Divisionsalgebren über Funktionenkörpern,” Nach. v. d. Gesell.
d. Wiss. Göttingen (1933), pp. 335-339.

16 The local invariants $r(\mathfrak{p})$ are defined to be the invariants of the limit algebras
$A_\mathfrak{p}$ of $A$.




for $\sum_{i=1}^{s} f(\mathfrak{p}_i)\alpha(\mathfrak{p}_i) = 0$.¹⁷ Thus, we established a generalization of the classical
norm theorem.

Suppose now that $k$ is a rational function field $T(x)$. Then the prime
divisors $\mathfrak{p}$ at finite distance with respect to $x$ can be represented by irreducible
polynomials $\pi$ of degree $f(\mathfrak{p})$ with respect to $x$. Moreover, every divisor of
degree 0 in $k = T(x)$ is a principal divisor, i.e. it belongs to an element of $k$.


We then can prove that for every finite set $\mathfrak{p}_1, \cdots, \mathfrak{p}_s$ of prime divisors to
which there are associated rational fractions $a_i b_i^{-1}$ whose sum is 0, there
exists an algebra $(T_n k/k, F_n, a)$ whose local invariants $r(\mathfrak{p}_i) = a_i b_i^{-1}$,
$r(\mathfrak{p}) = 0$ if $\mathfrak{p} \ne \mathfrak{p}_1, \cdots, \mathfrak{p}_s$.


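To prove this assertion we proceed as follows.¹⁸ Put $n = \prod_{i=1}^{s} b_i f(\mathfrak{p}_i)$ and
$\alpha_i = a_i n (b_i f(\mathfrak{p}_i))^{-1}$, $i = 1, 2, \cdots, s$. Then the divisor $\prod_{i=1}^{s} \mathfrak{p}_i^{\alpha_i}$ has the order


$\sum_{i=1}^{s} \alpha_i f(\mathfrak{p}_i) = n \sum_{i=1}^{s} a_i (b_i f(\mathfrak{p}_i))^{-1} f(\mathfrak{p}_i) = n \sum_{i=1}^{s} a_i b_i^{-1} = 0.$


Consequently, by assumption, $\prod_{i=1}^{s} \mathfrak{p}_i^{\alpha_i} = (a)$. Hence the algebra $(T_n k/k, F_n, a)$
obviously has all the required properties.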
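
The arithmetic of this construction can be traced on a small example. The Python sketch below is a hedged illustration, assuming the local invariant is r(p) = f(p)α(p)/n (mod 1) and that the prescribed fractions a_i/b_i are given with b_i > 0. It computes n and the exponents α_i, then checks that the divisor has order 0 and that the invariants come out as prescribed. The function names and the sample data are hypothetical.

```python
from fractions import Fraction
from math import prod


def divisor_exponents(targets, degrees):
    """Return (n, [alpha_i]) for prescribed invariants a_i/b_i at primes of degree f_i.

    targets: list of pairs (a_i, b_i) with b_i > 0 and sum of a_i/b_i equal to 0.
    degrees: list of the degrees f(p_i) > 0.
    """
    if sum(Fraction(a, b) for a, b in targets) != 0:
        raise ValueError("the prescribed fractions must sum to 0")
    n = prod(b * f for (_, b), f in zip(targets, degrees))
    # b_i * f_i divides n, so each quotient below is an exact integer.
    alphas = [a * n // (b * f) for (a, b), f in zip(targets, degrees)]
    return n, alphas


def divisor_order(alphas, degrees):
    """Order (degree) of the divisor with exponents alpha_i at primes of degree f_i."""
    return sum(alpha * f for alpha, f in zip(alphas, degrees))


def local_invariants(n, alphas, degrees):
    """Invariants f_i * alpha_i / n reduced mod 1 (assumed formula, see lead-in)."""
    return [Fraction(f * alpha, n) % 1 for alpha, f in zip(alphas, degrees)]
```

With the fractions 1/2, 1/3, -5/6 at primes of degrees 1, 2, 3 one gets n = 216 and exponents 108, 36, -60. The order is 108 + 72 - 180 = 0, and the invariants are 1/2, 1/3 and 1/6, the last being -5/6 reduced mod 1.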

If the genus of $k$ is greater than 0 then one can readily construct examples
for which there exist no algebras with prescribed invariants. Namely, take
for $k$ a field of genus $> 0$ whose defining equation $f(x, y) = 0$ has coefficients
in the field of all complex numbers. There exist then infinitely many
divisor classes whose orders are infinite. Selecting the $\mathfrak{p}_i$ and $a_i b_i^{-1}$ appropriately
one easily can construct the necessary counter examples.


We now want to prove that every division algebra
$D \sim (kT_n/k, F_n, a)$ is ramified.


Let $a = \prod_{i=1}^{s} \pi_i^{\alpha_i}$, $0 < \alpha_i < n$, where the $\pi_i$ are irreducible polynomials in
$x$ belonging as uniformizing variables to the (finite) prime divisors $\mathfrak{p}_i$. Then
the algebra $(kT_n/k, F_n, a)$ is similar to the direct product of the $s$ algebras

$A_i = (kT_n/k, F_n, \pi_i^{\alpha_i}).$

Each one of these algebras is at most ramified at $\mathfrak{p}_i$ and $\mathfrak{p}_\infty$, where $\mathfrak{p}_\infty$ denotes
the denominator of $x$. The invariants $r(\mathfrak{p}_i)$ of $A$ are the same as the invariants
of $A_i$ at $\mathfrak{p}_i$ according to the structure of local algebras. Thus,
a finite prime divisor $\mathfrak{p}_i$ gives rise to a local division algebra, if and only if

p. 45. 
Theorem 18. 2%. 




O. F. G. SCHILLING. 


$a_i f(p_i) \not\equiv 0 \pmod{n}.$


Now let us prove that $a f(p_i) \equiv 0 \pmod{n}$ implies $A_i \sim k$.
Let $(T_n k/k, F_n, \pi^a)$ be such an algebra. Denote $(f, n)$ by $d$. Then


with prime divisors %; in 7,k. All these prime divisors Bj = (II;) are 
principal for T,k = We have 


NP, = p"/4 = (x) "/4, 


Next = Hence (x)* = Namely, there exist integers 
p, v, such that nd-"p + fd“v—1. Now, as a consequence of af =0 (mod n), 
af==gn. Whence we get afd = Consequently, 


N = (3) 90/4 (a) 21/4, and 
= (3) o"/4, 

Hence 
N = (pj? = 4. 


Since units are irrelevant for the structure of factor sets in cyclic algebras, 
we get 
== NII,F, or 
(Tnk/k, Fn, x) ~k if af =0 (mod n). 


Consequently, the algebras $A_i$ for which $a_i f(p_i) \equiv 0 \pmod{n}$ can be omitted
in the representation of the algebra $A$. Combining these results we find that
a cyclic product $A$ which is similar to a proper division algebra over $k$ must
have at least two ramifications.

As usual we have¹⁹


THEOREM 3. The class group of normal algebras over $k = T(x)$ is
isomorphic with a subgroup $S$ of the additive group $\{r(p)\}$ of all vectors of
rational numbers mod 1. The group $S$ consists of all vectors for which
$\sum_p r(p) \equiv 0 \pmod{1}$ and $r(p) = 0$ for almost all $p$.


Finally, we remark that the index and exponent of any normal algebra
over $T(x)$ coincide.²⁰


UNIVERSITY OF CHICAGO. 


¹⁹ W, Theorem 19, p. 27.
²⁰ W, Theorem 20, p. 28.




ON A CERTAIN PARTITION FUNCTION.* 


By Ivan Niven.** 


1. Introduction. It has been shown by Schur¹ (and by Gleissberg²
with a different method) that the number $a_m$ of partitions of an integer $m$
with summands of the form $6n \pm 1$ equals the number of partitions of $m$
such that the difference between any two summands is at least three, and at
least six in case both summands are divisible by three. The purpose of the
present paper is the evaluation of the $a_m$, which may be considered as the
coefficients of the powers of $x$ in the expansion of the function


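(1.1) $F(x) = \dfrac{f(x)\, f(x^6)}{f(x^2)\, f(x^3)} = \sum_{m=0}^{\infty} a_m x^m$
as a power series, where

(1.2) $f(x) = \prod_{n=1}^{\infty} (1 - x^n)^{-1}.$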


That $a_m$ represents the number of partitions of the integer $m$ having summands
of the form $6n \pm 1$ is immediately verified by expanding $F(x)$.
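
The two counting descriptions of $a_m$ can be compared numerically for small $m$. The sketch below is illustrative only: it assumes the summands are those congruent to $\pm 1$ modulo 6 (Schur's theorem), and it assumes that (1.1) has the product form $F(x) = f(x) f(x^6) / (f(x^2) f(x^3))$ with $f(x) = \prod (1 - x^n)^{-1}$. All function names are hypothetical.

```python
from functools import lru_cache

def parts_pm1_mod6(limit):
    """a[m] = number of partitions of m into parts congruent to +-1 mod 6."""
    a = [1] + [0] * limit
    for part in range(1, limit + 1):
        if part % 6 in (1, 5):
            for m in range(part, limit + 1):
                a[m] += a[m - part]
    return a

@lru_cache(maxsize=None)
def _schur_tail(remaining, prev):
    """Ways to finish a partition whose last (smallest so far) part is prev."""
    if remaining == 0:
        return 1
    total = 0
    for q in range(1, min(remaining, prev - 3) + 1):  # gap of at least three
        if prev % 3 == 0 and q % 3 == 0 and prev - q < 6:
            continue  # two multiples of three must differ by at least six
        total += _schur_tail(remaining - q, q)
    return total

def schur_count(m):
    """Partitions of m satisfying Schur's difference conditions."""
    if m == 0:
        return 1
    return sum(_schur_tail(m - first, first) for first in range(1, m + 1))

def product_form_holds(limit):
    """Check F(x) f(x^2) f(x^3) = f(x) f(x^6) up to x^limit (assumed form of (1.1))."""
    def f_series(step):  # coefficients of f(x^step)
        c = [1] + [0] * limit
        for part in range(step, limit + 1, step):
            for m in range(part, limit + 1):
                c[m] += c[m - part]
        return c
    def mul(u, v):  # truncated product of two power series
        w = [0] * (limit + 1)
        for i, ui in enumerate(u):
            if ui:
                for j in range(limit + 1 - i):
                    w[i + j] += ui * v[j]
        return w
    lhs = mul(parts_pm1_mod6(limit), mul(f_series(2), f_series(3)))
    rhs = mul(f_series(1), f_series(6))
    return lhs == rhs

a = parts_pm1_mod6(40)
print(a[:12])
```

For example, $a_7 = 3$ counts $7$, $5+1+1$ and $1+\dots+1$ on one side and $7$, $6+1$, $5+2$ on the other.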

The method used is essentially that employed by Professor Rademacher³
in his investigation of the modular function $J(\tau)$. The author takes this
opportunity to thank Professor Rademacher for suggesting the problem and
for advice on its solution.


2. Transformation formulas. We employ the familiar transformation
formula⁴


(2.1) f} exp ait) 


= V exp (4-*) ( f ) exp (2 ) 


* Received May 5, 1939. 

** Harrison Research Fellow. 

¹ I. Schur, "Zur additiven Zahlentheorie," Sitzungsberichte der Berliner Akademie,
1926, pp. 488-495.

² "Über einen Satz von Herrn I. Schur," Mathematische Zeitschrift, vol. 28 (1928),
pp. 372-382.

³ Hans Rademacher, "The Fourier coefficients of the modular invariant J(τ),"
American Journal of Mathematics, vol. 60 (1938), pp. 501-512.

⁴ G. H. Hardy and S. Ramanujan, "Asymptotic formulae in combinatory analysis,"
Proceedings of the London Mathematical Society (2), vol. 17 (1918), pp. 75-115,
Lemma 4.31.




in which
(2.2) $(h,k) = 1, \qquad hh' \equiv -1 \pmod{k},$
and $\omega_{h,k}$ is a root of unity frequently used in modular function theory. Hardy
and Ramanujan⁵ give the values
(2.3) $\omega_{h,k} = \left(\dfrac{-k}{h}\right) \exp\left(-\pi i\left\{\tfrac14 (2 - hk - h) + \tfrac1{12}(k - 1/k)(2h - h' + h^2h')\right\}\right)$
for $h$ odd, and
(2.4) $\omega_{h,k} = \left(\dfrac{-h}{k}\right) \exp\left(-\pi i\left\{\tfrac14 (k - 1) + \tfrac1{12}(k - 1/k)(2h - h' + h^2h')\right\}\right)$
for $k$ odd. We wish to obtain a transformation formula similar to (2.1)
for the function


by applying (2.1) to (1.1). There arise four cases, according as (k,6) has 
the value 6, 3, 2, or 1, and the value of the function (2.5) is, respectively, 


(2.7) (2) exp (xi 


In the last three cases, $h'$ is a solution of the congruence $hh' \equiv -1 \pmod{k}$
such that it is divisible by 2, 3, and 6 respectively; clearly this is possible
because of the divisibility properties of $k$ in the various cases. Also we have


( ) Onx Yx(z) = exp P (1/z + 2z) ( for (k, 6) 


Oh, k/2M3h,k 


(1/32 + 2) for (k, 6) = 2, ant 


(2.13) Oy, — exp } — (—(1/3z) + 22) {for 6) = 


Woh, kKW3h,k 


5 Loc. cit., p. 85. 




3. Applying Cauchy’s Integral Formula to (1.1) we obtain


(3.1) dm = wth dx 


Osh 


wherein $\sum'$ means that $h$ is summed over values prime to $k$. We choose $C$
to be the circle $|x| = \exp(-2\pi N^{-2})$, so that the Farey arc $\xi_{h,k}$ is given by


where $-\theta'_{h,k} \le \phi \le \theta''_{h,k}$; and, if $h_1/k_1$, $h_2/k_2$ are the neighbours on the left
and right respectively of $h/k$ in the Farey series of order $N$,


1 1 


(3. 2) Onn = 


Thus we have 


>’ f F exp — — ig) ) 


OSh< 


Qrih 


X exp —m (— + dd, 


the subscripts being omitted from $\theta'$ and $\theta''$. Now set $w = N^{-2} - i\phi$ and
$t = kw$.


(3. 3) Om = exp Qarih 


h.k k 
OSh k=N 


x exp exp (2rmw)d¢. 
| k ( 


In order to make use of the formulas (2.6), (2.7), (2.8), and (2.9),
we break $a_m$ above into the parts $a_m^{(6)}$, $a_m^{(3)}$, $a_m^{(2)}$, and $a_m^{(1)}$ according as
$(k,6)$ equals 6, 3, 2, 1 respectively. Applying (2.6) to $a_m^{(6)}$, we have


(3.4) SY’ exp (- On, 
hk 


OSh < k=N 
(/-,6)=6 


x f dn exp + 2xmw dd. 
n=0 k k?w 


Splitting off the first term in the summation in the integrand, we write,
making use of the fact that $a_0 = 1$, and (2.10),


N 
(3.5) D’ exp (— 


nw 




Hence $a_m^{(6)} = I_1 + I_2$, $I_1$ being the same as (3.4) with the summation in
the integrand ranging from 1 to $\infty$.


4. The evaluation of $I_1$. Formulas (2.3) and (2.10) imply


(4. 1) — exp} { tor (k,6) = 
where $h'$ has been chosen so that $hh' \equiv -1 \pmod{12k}$. Using
1 1 1 1 
42) = < 


we may write 


N 
(43) Son f exp} — (2m —%) + xw(2m — ¥) 
(,6)=6 ; 
k(N+k) 
exp} — (mh — nh’) 


N 00 i N+k-1 
+> Ya. D exp; (mh—nh’) - f 
k=1 n=1 h l=k,+k 
(k,6)=6 1 

kl 
1 
N N+k-1 
oO 2 +k— 
k=1 n=1 hmodk k l=kgtk 
(k,6)=6 1 
k(1+1) 
8; + + 83, 


where the integrand is the same in all three expressions. By (4.1) the inner
sum in $S_1$ is the Kloosterman sum

The quantities in parentheses are integers since $k$ is divisible by 6. Since we
required $hh' \equiv -1 \pmod{12k}$, this sum may be considered as an incomplete
sum mod $12k$; using a device of Estermann,⁶ and an estimate of Salié,⁷ the
sum (4.4) is
$O\big(k^{2/3+\epsilon}\,(12m - 1 - k^2/3,\ 12k)^{1/3}\big),$


⁶ T. Estermann, "Vereinfachter Beweis eines Satzes von Kloosterman,"
Abhandlungen Hamburg. Math. Seminar, vol. 7 (1929), p. 94.

⁷ H. Salié, "Zur Abschätzung der Fourierkoeffizienten ganzer Modulformen,"
Mathematische Zeitschrift, vol. 36 (1933), p. 264.


Using the inequality 


== k?(N-* + k?N-? + x 9 
we obtain 
exp | + 1N*(2m— ¥%) ) 
k=1 
= 0 (5 exp (2rmN~) My eXp (— Ske ve) 
N n=1 2 k=1 
=0 (5 exp (2rmN-*) manele) 
(4.6) S, = O(exp(2rmN-*) 


Because of the similarity of $S_2$ and $S_3$, we shall treat only the latter.
Interchanging the summations with respect to $h$ and $l$ we get


N o N+k-1 


, k? 
exp) — 7) (—12+1—F)| 


In order to interpret the restriction on $k_2$ in the inner sum, we recall from
the theory of Farey series that if

$h_1/k_1 < h/k < h_2/k_2$

are three neighbors in the Farey series of order $N$, then

$hk_1 - h_1k = 1 = h_2k - hk_2,$

so that

$hk_1 \equiv -hk_2 \equiv 1 \pmod{k},$
or, by applying (2.2)
(4.7) $-k_1 \equiv k_2 \equiv h' \pmod{k}.$

From this it follows that the above restriction on $k_2$ implies a restriction on $h'$
to an interval mod $k$ equivalent to one or two intervals in the range $0 \le h' < k$.
If the Kloosterman sum is considered mod $12k$, we have a restriction on both
$h$ and $h'$. We proceed to remove the restriction on $h$, so that the sum may be
treated as was (4.4).
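
The Farey relations used above are easy to confirm by machine. The sketch below generates the Farey series of a given order with the standard next-term recurrence and checks, for every interior fraction $h/k$ with neighbours $h_1/k_1$ and $h_2/k_2$, the determinant identities and the congruence $-k_1 \equiv k_2 \equiv h' \pmod{k}$, where $hh' \equiv -1 \pmod{k}$. The function names are hypothetical, and `pow(h, -1, k)` requires Python 3.8 or later.

```python
def farey(n):
    """Farey series of order n as a list of (h, k) pairs in increasing order."""
    seq = [(0, 1)]
    a, b, c, d = 0, 1, 1, n
    while c <= n:
        q = (n + b) // d
        a, b, c, d = c, d, q * c - a, q * d - b
        seq.append((a, b))
    return seq

def check_neighbours(n):
    """Verify the neighbour identities for order n; return the number of triples."""
    seq = farey(n)
    count = 0
    for (h1, k1), (h, k), (h2, k2) in zip(seq, seq[1:], seq[2:]):
        assert h * k1 - h1 * k == 1
        assert h2 * k - h * k2 == 1
        h_prime = (-pow(h, -1, k)) % k      # h * h' = -1 (mod k)
        assert (-k1) % k == h_prime
        assert k2 % k == h_prime
        count += 1
    return count

print(farey(5))
print(check_neighbours(12))
```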




We shall prove that if the sum 


(4. 8) exp 12k h{ 12m —1 3 
+h 12n + 1—— 


is multiplied by 
| 
(4. 9) 1+ exp al (12m —1— =) 
+ ple ( en | 


where $\beta = \alpha$ or $7\alpha$ according as $(k,12) = 12$ or 6, the product is


2 k? 
+- h’ (— 12n + 1— 
where $h'$ satisfies the congruence $hh' \equiv -1 \pmod{12k}$. We must prove that


$(h + \alpha k)(h' + \beta k) \equiv -1 \pmod{12k},$
or, multiplying by $h$,
$\beta h^2 - \alpha + \alpha\beta hk \equiv 0 \pmod{12}.$
If $(k,12) = 12$, we use $\beta = \alpha$ and require
$\alpha(h^2 - 1) \equiv 0 \pmod{12}$
which is obviously true since $h$ is prime to $k$ which is divisible by 6. If
$(k,12) = 6$, we use $\beta = 7\alpha$ and require
$\alpha(7h^2 - 1) + 7\alpha^2 hk \equiv 0 \pmod{12}$
which reduces, since $7h^2 - 1 \equiv 6 \pmod{12}$, to
$\alpha(6 + 7\alpha kh) \equiv 0 \pmod{12}.$


This is true since the factor in parentheses is divisible by 6 when $\alpha$ is even,
and by 12 when $\alpha$ is odd. The argument following (4.4) shows that (4.10),
and hence (4.8), equals


(4.11) O( — 1 — k*/3, 12k) /*) = 
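
The congruence $(h + \alpha k)(h' + \beta k) \equiv -1 \pmod{12k}$ established before (4.11) can be tested exhaustively for small $k$. The following is a sketch under the reading $\beta = \alpha$ when $12 \mid k$ and $\beta = 7\alpha$ when $(k,12) = 6$; the helper name `shifted_pair_ok` is hypothetical.

```python
from math import gcd

def shifted_pair_ok(h, k, alpha):
    """Check (h + alpha*k)(h' + beta*k) = -1 (mod 12k) for one triple."""
    assert k % 6 == 0 and gcd(h, k) == 1
    modulus = 12 * k
    # h is prime to k and k is divisible by 6, so h is invertible mod 12k.
    h_prime = (-pow(h, -1, modulus)) % modulus
    beta = alpha if k % 12 == 0 else 7 * alpha
    return (h + alpha * k) * (h_prime + beta * k) % modulus == modulus - 1

all_ok = all(
    shifted_pair_ok(h, k, alpha)
    for k in range(6, 61, 6)
    for h in range(1, k)
    if gcd(h, k) == 1
    for alpha in range(12)
)
print(all_ok)
```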


Thus 
X exp = 4+ aN-2(2m — %) 


(4. 12) S, = O(exp(2rmN-) , 
Combining (4.3), (4.6), and (4.12), we obtain 
(4. 13) I, = O(exp(2rmN-?) +e) 


5. The evaluation of $I_2$. We divide the expression $I_2$ in (3.5) into
three parts $Q_0(m)$, $Q_1(m)$, and $Q_2(m)$ by splitting the limits of integration


into the parts


1 1 
k(N a i(k, +h)? 


and 
k(N +h)’ 
Thus 
ki 
Qo(m) = D> B,.(m) f exp Fp + rw (2m — ) dd, 
where 


(5. 1) By(m) = exp (- for (hk, 6) = 6. 


h mod k 
Let R be the positive circuit of the rectangular path with vertices 
k(N +k) 


Then, using $w = N^{-2} - i\phi$, the integral in $Q_0(m)$ above may be equated to


1 1 1 f 
1e 
i i i 
N-3+ _N-24 -N-2. 


= (J,+J.+J3), 


where all four integrals have the same integrand. 


The integral 
1 . 
P (Zz ( )) 


may be expressed in terms of well known integrals from the theory of Bessel
functions. For the Bessel function of the first kind with purely imaginary
argument⁸ we have


(5.3) $I_p(z) = \dfrac{(z/2)^p}{2\pi i} \int_{-\infty}^{(0+)} t^{-p-1} \exp\left(t + \dfrac{z^2}{4t}\right) dt$
and

(5.4) $I_{p-1}(z) - I_{p+1}(z) = \dfrac{2p}{z}\, I_p(z);$
the latter being used with $p = 0$. From these we obtain

1 12m —1 
5.5 =| ( }. 
( ) ) kV 12m —1 1 3k 


Now, along the paths of integration in $J_1$ and $J_3$ we have


< < N-2 
w= wt ky’ 
It follows that 


2 
+ 
so that the absolute value of the integrand in each of $J_1$ and $J_3$ is


S exp(%m + 2xmN-*) 


and hence 
| | \ <= 2N-* exp(%4r + 2nmN-). 
3 
In $J_2$ we have $w = -N^{-2} + iv$, where
1 1 
= 
It follows that 
$\Re(w) = -N^{-2} < 0, \qquad \Re(1/w) = \dfrac{-N^{-2}}{N^{-4} + v^2} < 0,$
so that the absolute value of the integrand is less than unity, whence 
-1N-1 
Collecting the last few results, and recalling that 
B,(m) = O , 
we obtain 
N 
(5.6) Qo(m) =24 Dd By(m)Ly(m) + O(exp(2emN-*) 


(k,6)=6 


⁸ G. N. Watson, Theory of Bessel Functions (Cambridge, 1922), p. 181, (1),
p. 79, (1).




We now treat $Q_1(m)$, and point out that $Q_2(m)$ may be handled
analogously. We had


hmodk 
(k,6)=6 


~ 
N+k-1 
x f exp + (2m )) dd. 
2 
kL 


Interchanging the summations with respect to $l$ and $h$, we obtain


1 
k( +1) 


N+k-1 

(5.7) Q(m= exp (—"— + ww(2m — %) ) do 
k=1 l=k +k 6h?w 
(k,6)=6 


1 


kl 


h mod k 
N < 


The inner sum may be treated as was (4.8), yielding 


O ) 
Noting that 


)- N- 
6k? (N-* + ¢?) 
us T 


= 6k?N?(N-* + + k)-?) = + 14) 


and 
R(mrw(2m — %)) < 
we conclude that 


exp (2amN-?) 


== () 
(> k(U+ 1) § 
= (0) b> exp (2am 


k-1 ] 


l=k,+k Li 


Since a similar result is valid for $Q_2(m)$, we combine this result with (4.13)
and (5.6) to obtain


N 
(5.8) S By(m)Ly(m) + O(exp (2rmN-2) , 
k=1 


(k,6)=6 


6. Estimations for $a_m^{(3)}$ and $a_m^{(2)}$. If the power series expansion for


OO 
is byw" then is, from (3.3) and (2.11), 


n=0 




(6.1) 35. exp — (2hm —h’n) 
n=0 = k | 
.6)=3 


9” 
exp Yo) + rw(2m— %) 


This differs from $a_m^{(6)}$ in that the coefficient of $1/w$ in the exponent of the
integrand is always negative, whereas in $a_m^{(6)}$ it was positive for $n = 0$, which
forced us to consider this case separately (section 5). It is clear, then, that
we proceed with (6.1) as we did with $I_1$ in section 4. Formulas (2.4) and
(2.11) imply


2 
(6.2) exp — } tor (k, 6) =3 


provided that h’ is chosen so that 
$hh' \equiv -1 \bmod 3k.$
Corresponding to (4.4) we now have 


> § k? + 3 h’ + 3 
(6.3) > exp | — 3h (3m — 


h mod k 


Since $(k,6) = 3$, $(k^2 + 3)$ is divisible by 12. Also we may choose $h'$ to be
even, restricting it to the interval $0 \le h' < 6k$. Then set $h'' = h'/2$, so that


< 3k and hh” = mod 3k. 


2 


The sum (6.3) is then an incomplete Kloosterman sum mod 3k, since for 
hh” =amodk, (a,k) =1 


>’ exp (uh + vh’’) \ >’ exp (uh + avh’) 


h mod k lk h mod k 
where hh’ ==1modk. Corresponding to (4.10) we have 


k? +3 h’ ke? + *) 
6. » + —( — 3n— 
exp 3k (3m 12 ) 2 ( 2 


h mod k 
N <k+keSl 


to which we apply the argument following (4.10). Thus 
(6. 5) Am) = O(exp(2amN-*) N-/9) 


The quantity 


N 
Oy Dd’ exp — (3hm — h’n) On, 
k=1 n=0 hmod k 3k 
(k,6)=2 


derived from formulas (2.8), (2.12), and (3.3), may be treated similarly. 
From (2.3) and (2.12) we obtain 

—exp | t for (k,6) = 2 
so that the Kloosterman sum is 


h mod k 
N 


where 


(k,6) =2 and hh’ or 


1 
3 3 mod 4k 


according as $k \equiv 1$ or $2 \bmod 3$. This sum is incomplete mod $4k$, and we
conclude that
(6. 6) Om?) = O(exp(2rmN , 


7. The evaluation of $a_m^{(1)}$. The quantity


(7.1) Ban exp — —— (6hm—h’n) One 
k=1 n=0 hmodk 6k 
(k,6)=1 


1— 12 (9m— 
xf, 12n) +- rw (2m ) ap 


resembles $a_m^{(6)}$ in that the coefficient of $1/w$ in the exponent of the integrand
is positive for $n = 0$, so that the first term of the inner sum must be treated
separately, which treatment follows that of section 5. Evaluating $\Omega_{h,k}$ with
the aid of (2.4) and (2.13), we have


h mod k / ke ; ‘ 3k j 

We may choose h’ divisible by 6 and satisfying 


$hh' \equiv -1 \bmod k$


since (k,6) =1. The Kloosterman sum is 


7.2 f k? —1 k?—1\) ) 
( ) h k h m 12 + 6 ( 


Note that (k? —1) is divisible by 12. Then h’/6 may be replaced by an 
integer AIV satisfying 


=" or mod k 




according as $k \equiv 1$ or $-1 \bmod 6$. Thus we obtain a result similar to (5.8),


AY 


(7.3) dm) = 24 S By(m)L,(m) + O(exp(2xmN~) , 


(k,6)=1 
wherein 
1 arV 12m — 6 
7.4 = ( ) 
kV72m—6 18k 


and $B_k(m)$ has the same form as in (5.1) but with $(k,6)$ now equal to
unity, $\Omega_{h,k}$ being defined by (7.2) in this case.

We now collect our results. Formulas (5.8), (6.5), (6.6), and (7.3) 
combine to give 


(7.5) dm =m) + + dm + dm™ 
N 
= 20 > By(m) (m) + O(exp(2amN~*) 
k=1 


provided we set 
(7.6) $L_k(m) = 0$ when $(k,6) = 3$ or 2.


In (7.5) we hold $m$ fixed and let $N$ become infinite so that the error term
becomes zero, and $a_m$ is expressed by the convergent series

$a_m = 2\pi \sum_{k=1}^{\infty} B_k(m)\, L_k(m),$

the various quantities in this result being given by formulas (5.1) with (4.1)
and (7.2), (5.5), (7.4), and (7.6).


UNIVERSITY OF PENNSYLVANIA. 




FINITE METABELIAN GROUPS AND PLUCKER LINE- 
COORDINATES.* 


By H. R. BRAHANA. 


1. Introduction. We are concerned with finite metabelian groups whose
operators are all, except identity, of order p. The metabelian groups are those
whose commutators are all in the central. If G is such a group and C is its
central, then G/C is abelian and of type 1, 1, .... Corresponding to every
subgroup of G/C there is a subgroup of G which is either abelian or metabelian.
Whenever G/C is the direct product of two of its subgroups, $T_1$ and $T_2$, which
are such that the corresponding subgroups $G_1$ and $G_2$ of G are both abelian,
then G is a subgroup of the holomorph of $G_1$ and also a subgroup of the
holomorph of $G_2$. Conversely, when G is a subgroup of the holomorph of one of
its abelian subgroups, then G/C is the direct product of two subgroups, $T_1$
and $T_2$, which correspond to abelian subgroups of G. A method of
classification of groups possessing this property has been given.¹ Our main interest
here is in the groups which do not possess this property.

A group G which is not abelian has a commutator subgroup which is not
identity. The central C either coincides with this commutator subgroup or
else is the direct product of it and an abelian subgroup C' which is of type
1, 1, .... In the latter case G itself is the direct product of C' and a
metabelian group G' which has all the interesting properties of G.² We shall
assume in what follows that the central and the commutator subgroup of G
coincide.

The abelian subgroup C of G is not maximal abelian, for the group {C, s},
where s is any operator of G not in C, is abelian. The group {C, s} may or
may not be maximal abelian. Whether or not {C, s} is maximal abelian will
depend in general on the choice of s. The possession of a maximal abelian
subgroup {C, s} will be a characteristic property of G. We propose to
investigate such groups G as have a maximal abelian subgroup {C, s}.³ This class

* Received December 12, 1938; Revised September 22, 1939. 

¹ Cf. for references H. R. Brahana, "Metabelian groups and trilinear forms,"
Duke Mathematical Journal, vol. 1 (1935), pp. 185-197.

² More specifically, if two groups have the same order and possess the same G',
then they are simply isomorphic.

³ Cf. American Journal of Mathematics, vol. 56 (1934), p. 496. The theorem
(5.2) which purports to deal with these groups is incorrect. We regret the incorrect
theorem and the erroneous proof. The theorem was beside the main line of development
of the paper and those which follow it.



of groups contains most of the groups which have been considered in the
papers referred to above. However, our present investigation will be greatly
facilitated by the work which has been done, for once it is recognized that G
belongs to the holomorph of one of its abelian subgroups G is quickly identified
by reference to those papers.

We denote the order of C by $p^c$; we denote the maximal abelian subgroup
{C, s} of order $p^{c+1}$ by H. Then we suppose G to be $\{H, U_1, U_2, \dots, U_k\}$.
The order of G is $p^{c+k+1}$. If H is transformed by $U = \{U_1, U_2, \dots, U_k\}$
there is obtained a commutator subgroup, which lies in C. The fact that H
is maximal abelian requires that the commutator subgroup obtained by
transforming H by U be of order $p^k$. Hence we must have $c \ge k$. If U were
abelian, G would be in the holomorph of H. We may therefore suppose that
U is non-abelian. U then contains a commutator subgroup of order $p^l$ where
$l \ge 1$. If the commutators of pairs of $U_i$'s are all independent, then
$l = k(k-1)/2$. Since C is the commutator subgroup as well as the central
of G, we have $k \le c \le k(k+1)/2$. We note that G is generated by the
$U_i$'s and s. The numbers k and c are characteristic for a given G. We shall
consider the groups G for a given k in subclasses according to the values of c.

When $c = k(k+1)/2$ the group is completely determined by the number
k, since two such groups with the same k may be made simply isomorphic by
letting generators correspond and letting commutators of corresponding pairs
of generators correspond. We shall refer to this group as the master group
for a given k. Such a group contains no abelian subgroup of order $p^{c+2}$.
For all other values of c there exist groups G which contain abelian subgroups
of order $p^{c+2}$. The possession of such subgroups is a characteristic property
of a group and hence any set of invariants which determines the group must
determine the number of such abelian subgroups contained in the group.
Our method is to examine G for its abelian subgroups. This brings into
prominence a certain matrix M whose properties give a set of definitive
properties of G for k < 4. The elements of M are linear forms in certain
indeterminates and the existence of certain abelian subgroups of G implies
the existence of sets of values for the indeterminates which determine certain
ranks for M. This investigation is carried out in §2 for k < 4.

In §3 the problem of the classification of these groups is approached from
another direction, and for the case k = 3 is seen to be closely connected with
Plücker line-coördinates in a finite three-space. The geometric formulation
of the problem gives it an appearance of simplicity which would be misleading
were we not warned by the intricacy of the considerations in §2. The
exposition of the close connection between these two seemingly distinct subjects is,
of course, the important contribution of this paper.




2. The groups G for k ≤ 3. When k = 1 then c = 1. G is of order
$p^3$ and there is only one such metabelian group. This is the master group
described in the preceding paragraph, and is otherwise well-known.⁴

When k = 2, then c is 2 or 3. There is but one group for c = 3 as was
seen above. For c = 2 there is a group G which belongs to the holomorph
of H. This group is generated by s, $U_1$, and $U_2$ which satisfy the following
relations with α = β = 0:

(1) $U_1^{-1} s U_1 = s s_1, \qquad U_2^{-1} s U_2 = s s_2, \qquad U_2^{-1} U_1 U_2 = U_1 s_1^{\alpha} s_2^{\beta}.$

$H = \{s, s_1, s_2\}$ is of order $p^3$ since s is not in C and H is maximal abelian.

Any group G with k = 2 is generated by operators satisfying (1) with
α and β having suitable values. Two groups generated by operators satisfying
(1) both having α = β = 0 are obviously simply isomorphic. Hence for a
group G, not in the holomorph of H, not both α = 0 and β = 0. The
subgroup of the holomorph of H described in the last paragraph contains the
abelian subgroup $\{C, U_1, U_2\}$ whose order is $p^{c+2} = p^4$. If α and β exist,
such that G contains no abelian subgroup of order $p^4$, then the resulting G
will be distinct from the one already obtained. An abelian subgroup of order
$p^4$ will contain two independent operators which are not in C and neither is
in the group generated by the other and C. Every operator of G can be
written in the form $c_i s^{a} U_1^{k} U_2^{l}$, where $c_i$ is some operator in C. The
commutator of this and any other operator of G is independent of $c_i$; hence for
the purpose of investigating commutators we may assume that $c_i = 1$. Let
$V_i = s^{a_i} U_1^{k_i} U_2^{l_i}$, $i = 1, 2$. Then


If $V_1$ and $V_2$ are permutable this commutator is identity and we have the
following congruences, mod p:


$a_1k_2 - a_2k_1 + \alpha(k_1l_2 - k_2l_1) \equiv 0,$
$a_1l_2 - a_2l_1 + \beta(k_1l_2 - k_2l_1) \equiv 0.$


These are linear in $a_2, k_2, l_2$. The matrix of coefficients is


$M = \begin{pmatrix} -k_1 & a_1 - \alpha l_1 & \alpha k_1 \\ -l_1 & -\beta l_1 & a_1 + \beta k_1 \end{pmatrix}$


whose rank is of course at most 2. Hence there is always a solution of the


system of congruences. This corresponds to the fact that $V_1$ is permutable


⁴ This is the only non-abelian group of order $p^3$ which contains operators of order
p only. It is to be noted that p ≠ 2, since the only groups whose operators are all
except identity of order 2 are the abelian groups of type 1, 1, ....




with any power of itself. If $V_2$ exists and is independent of and permutable
with $V_1$, then for some $V_1$ the above system must have two independent
solutions and the rank of M must be at most 1. This requires that


$\beta k_1l_1 + a_1l_1 - \alpha l_1^2 \equiv 0,$
$\alpha k_1l_1 - a_1k_1 - \beta k_1^2 \equiv 0.$


Since H is maximal abelian we cannot have $k_1 = l_1 = 0$, and hence the two
conditions reduce to $\beta k_1 + a_1 - \alpha l_1 \equiv 0$. Therefore, whatever the values of
α and β, $a_1, k_1, l_1$ may be found such that the rank of M is 1. Consequently,
if c = 2, G contains an abelian subgroup of order $p^{c+2}$.

The abelian group $\{C, V_1, V_2\}$ does not contain s, for in that case H
would not be maximal abelian. Hence G is generated by s, $V_1$, and $V_2$ which
satisfy the relations (1) with α = β = 0. G is therefore a subgroup of the
holomorph of H whenever c = 2.
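
The rank argument for k = 2, c = 2 can be confirmed by brute force for small primes. The sketch below assumes the 2 by 3 matrix with rows $(-k_1,\ a_1 - \alpha l_1,\ \alpha k_1)$ and $(-l_1,\ -\beta l_1,\ a_1 + \beta k_1)$; the function names are hypothetical. It checks that for every α, β there is a choice of $a_1, k_1, l_1$ with $k_1, l_1$ not both zero for which the rank is at most 1, and that $a_1 \equiv \alpha l_1 - \beta k_1$ always works.

```python
from itertools import product

def rank_mod_p(rows, p):
    """Rank of an integer matrix over the field of p elements (p prime)."""
    m = [[x % p for x in r] for r in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)
        m[rank] = [x * inv % p for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                factor = m[r][col]
                m[r] = [(x - factor * y) % p for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank

def matrix_k2(a, k, l, alpha, beta):
    """Coefficient matrix of the two congruences in a_2, k_2, l_2."""
    return [[-k, a - alpha * l, alpha * k],
            [-l, -beta * l, a + beta * k]]

def has_rank_one_choice(alpha, beta, p):
    """True if some (a_1, k_1, l_1) with (k_1, l_1) != (0, 0) gives rank <= 1."""
    return any(
        rank_mod_p(matrix_k2(a, k, l, alpha, beta), p) <= 1
        for a, k, l in product(range(p), repeat=3)
        if (k, l) != (0, 0)
    )

always = all(
    has_rank_one_choice(alpha, beta, p)
    for p in (3, 5, 7)
    for alpha in range(p)
    for beta in range(p)
)
print(always)
```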

When k = 3, we have 3 ≤ c ≤ 6. There is one and only one group
for c = 6. The simpler cases are those for which c is large; we consider the
case where c = 5. The commutators obtained by transforming s by $U_1$, $U_2$,
and $U_3$ are independent and at least two of those obtained by transforming
one $U_i$ by another are independent of each other and the three preceding
commutators. We may therefore suppose that generators of G satisfy the
relations:

(2) $U_1^{-1} s U_1 = s s_1, \qquad U_2^{-1} U_1 U_2 = U_1 s_1^{\alpha} s_2^{\beta} s_3^{\gamma} s_4^{\delta} s_5^{\epsilon},$
    $U_2^{-1} s U_2 = s s_2, \qquad U_3^{-1} U_1 U_3 = U_1 s_4,$
    $U_3^{-1} s U_3 = s s_3, \qquad U_3^{-1} U_2 U_3 = U_2 s_5.$

There exists one such group with α = β = γ = δ = ε = 0. This group
contains the abelian subgroup $\{C, U_1, U_2\}$ whose order is $p^{c+2}$. Conversely,
any group G with k = 3 and c = 5, which contains H as a maximal abelian
subgroup and contains also an abelian subgroup of order $p^{c+2}$, is simply
isomorphic with this group, for the abelian subgroup of order $p^{c+2}$ contains two
operators $V_1$ and $V_2$ such that $\{C, V_1, V_2\}$ is of order $p^{c+2}$ and does not
contain H. Then $V_1$ and $V_2$ may be used for $U_1$ and $U_2$ in relations (2).

Therefore, if any other group exists, it must contain no abelian subgroup
of order $p^{c+2}$. Let $V_i = s^{a_i} U_1^{k_i} U_2^{l_i} U_3^{m_i}$, $i = 1, 2$. The condition that G
have an abelian subgroup of order $p^{c+2}$ is that there exist $V_1$ and $V_2$ which are
independent and permutable. This leads as before to a set of congruences
bilinear in the two sets of exponents. Considering these as linear congruences
in the exponents of $V_2$ and writing the condition that the system have a
solution, we obtain the matrix




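$M = \begin{pmatrix}
-k_1 & a_1 - \alpha l_1 & \alpha k_1 & 0 \\
-l_1 & -\beta l_1 & a_1 + \beta k_1 & 0 \\
-m_1 & -\gamma l_1 & \gamma k_1 & a_1 \\
0 & -m_1 - \delta l_1 & \delta k_1 & k_1 \\
0 & -\epsilon l_1 & -m_1 + \epsilon k_1 & l_1
\end{pmatrix}.$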


If $V_1$ and $V_2$ independent and permutable exist it is necessary and sufficient
that $a_1, k_1, l_1, m_1$ exist such that the rank of M is at most 2. Now a solution
of the system of congruences is a set of values for $a_2, k_2, l_2, m_2$ which define a
$V_2$ permutable with another operator and hence is a set of values which when
used in place of $a_1, k_1, l_1, m_1$ in M reduce the rank to 2. Also the existence
of such a $V_2$ implies that any operator of $\{V_1, V_2\}$ when expressed in terms
of s, $U_1$, $U_2$, $U_3$ has a set of exponents which will reduce the rank of M to 2.
Consequently, if there exists a $V_1$ which reduces the rank of M to 2, there
exists a $V_1$ with $m_1 = 0$ which reduces the rank of M to 2. Letting $m_1 = 0$
and recalling that we may not have $a_1 = k_1 = l_1 = 0$ also, we obtain


$\gamma + \beta\delta - \alpha\epsilon = 0$


as the condition that it be possible to reduce the rank of M to 2. Since there
exist numbers α, β, γ, δ, ε which do not satisfy this condition there exist
groups G with c = 5 which contain no abelian subgroup of order $p^{c+2}$.
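
The rank condition for k = 3, c = 5 can be explored by brute force over a small field. The sketch below is illustrative and rests on assumptions: the five rows written out in `matrix_c5` are this sketch's reading of M (the row belonging to $s_5$ is supplied by analogy with the row belonging to $s_4$), and the condition tested, $\gamma + \beta\delta - \alpha\epsilon \equiv 0$, is the one that follows from those rows. The function names are hypothetical.

```python
from itertools import product

def rank_mod_p(rows, p):
    """Rank of an integer matrix over the field of p elements (p prime)."""
    m = [[x % p for x in r] for r in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)
        m[rank] = [x * inv % p for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                factor = m[r][col]
                m[r] = [(x - factor * y) % p for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank

def matrix_c5(v, params):
    """Assumed 5 x 4 coefficient matrix for V_1 = s^a U_1^k U_2^l U_3^m."""
    a, k, l, m = v
    al, be, ga, de, ep = params
    return [
        [-k, a - al * l, al * k, 0],       # exponent of s_1
        [-l, -be * l, a + be * k, 0],      # exponent of s_2
        [-m, -ga * l, ga * k, a],          # exponent of s_3
        [0, -m - de * l, de * k, k],       # exponent of s_4
        [0, -ep * l, -m + ep * k, l],      # exponent of s_5 (assumed row)
    ]

def rank_two_reachable(params, p):
    """True if some non-zero exponent vector reduces the rank to at most 2."""
    return any(
        rank_mod_p(matrix_c5(v, params), p) <= 2
        for v in product(range(p), repeat=4)
        if any(v)
    )

def condition(params, p):
    al, be, ga, de, ep = params
    return (ga + be * de - al * ep) % p == 0

p = 3
agree = all(
    rank_two_reachable(params, p) == condition(params, p)
    for params in product(range(p), repeat=5)
)
print(agree)
```

With all five parameters zero the rank can be reduced to 2, while the choice γ = 1 with the others zero admits no such reduction, in line with the two groups distinguished in the text.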

We now determine a canonical form for a set of generating relations for
the group with c = 5, and no abelian subgroup of order $p^{c+2}$, and in so doing
show that these conditions are sufficient to determine the group. This
canonical form is a particular set of values for α, β, γ, δ, ε. This set of numbers
determines the expression of the commutator of $U_1$ and $U_2$ in terms of the
commutators of the other pairs of generators. Taking $s_1, s_2, \dots, s_5$ as defined
by (2), we see that the commutator of $U_1$ and $U_2$ cannot be in $\{s_4, s_5\}$ for
then the group $\{U_1, U_2, U_3\}$ would be that metabelian group we considered
with k = 2 and c = 2 and hence would contain an abelian subgroup of order
$p^4$. Though this group is of order $p^4$, it would determine an abelian subgroup
of order $p^{c+2}$ of G, namely, the direct product of the group of order $p^4$ and
$\{s_1, s_2, s_3\}$. Therefore the commutator subgroup of $\{U_1, U_2, U_3\}$ has a
subgroup of order p in common with $\{s_1, s_2, s_3\}$. Every operator of the
commutator subgroup of $\{U_1, U_2, U_3\}$ is a commutator, for otherwise some quotient
group of $\{U_1, U_2, U_3\}$ would be a metabelian group with k = 2, c = 2, and
no abelian subgroup of order $p^4$. We have seen that no such group exists.
Therefore, $U'_1$ and $U'_2$ may be chosen in U so that their commutator is in
$\{s_1, s_2, s_3\}$. Let $s'_i$ be the commutator of $U'_i$ and s. The commutator of $U'_1$
and $U'_2$ cannot be in $\{s'_1, s'_2\}$; for then $\{H, U'_1, U'_2\}$ would have a
commutator subgroup of order $p^2$ and hence would have an abelian subgroup of order
$p^{c+2}$. We may therefore denote the commutator of $U'_1$ and $U'_2$ by $s'_3$, and
then find in U an operator $U'_3$ independent of $U'_1$ and $U'_2$ which with s has
$s'_3$ for a commutator. Therefore, any group having the given properties
has generators which satisfy (2) with α = β = δ = ε = 0 and γ = 1. Thus
there are just two groups with k = 3 and c = 5, and they are distinguished
by the fact that one contains an abelian subgroup of order $p^{c+2}$ and the other
does not.

When c = 4 the considerations which we have employed above show that 




G is generated by operators satisfying the relations: 


U SS}, U; U. — 
(3) SSo, Us U 181%8 52.72, 


The condition for permutability of V; and V,. is that the matrix 


—k, a, — am, ak, Gok, 

— |, Bil, — a+ Bik, Bek 
M = 

— — yl; 4- yeh 


be of rank 2. Again, if $V_1$ exists such that the rank of M is 2 then there
exists such a $V_1$ with $m_1 = 0$. In order that M be of rank 2 when $m_1 = 0$
it is necessary that $\gamma_1 = 0$ or else that $l_1 = 0$. The latter possibility requires
further that

+ (Bi + + (Bry2 — Boy) = 


In order that a, and k,, rational and not both zero, exist and satisfy this 


relation it is necessary and sufficient that 
(4) (Bi y2)* + 


be a square, mod p. Since it is possible to select numbers $\alpha_1, \beta_1, \dots, \gamma_2$
with $\gamma_1 \neq 0$ so that this condition is not satisfied it follows that there exist
groups G with c = 4 which contain no abelian subgroup of order $p^{c+2}$. When
$\gamma_1 = 0$, then (4) is a square, and therefore, that (4) be a square, is a
necessary as well as a sufficient condition for G to contain an abelian subgroup of
order $p^{c+2}$.
Two groups which have k = 3, c= 4, and which contain no abelian 
subgroup of order p**? are simply isomorphic. One such group is generated 
by operators satisfying (3) with Be=1, 
where r is a particular not-square. New generators U’;, U’,, U’, may be 
selected in U so that 




, , 
C1. = 


(5) = 8’, 29 


where $c'_{ij}$ is the commutator of $U'_i$ and $U'_j$, $s'_i$ is the commutator of $U'_i$
and s, and $\alpha_1, \beta_1, \dots, \gamma_2$ are arbitrary except for the condition that (4) be
not a square. Let

U", U;, 

U’, = U,4U,4U 

= 


If the commutators are obtained and conditions derived that they satisfy (5), 
there are obtained six linear non-homogeneous congruences in the six unknowns 
+, The ranks of the matrix of coefficients and the augmented 
matrix are the same provided $\beta_1\gamma_2 - \beta_2\gamma_1 \neq 0$, which is true whenever (4)
is not a square. This completes the proof of the statement at the beginning
of the paragraph.

For any other group whose generators satisfy (3) we may suppose that
(4) is a square. Then there exists a set $a_1, k_1, 0, 0$ which reduces the rank
of M to at most 2. If for a particular set the rank of M becomes 1, then the
corresponding system of congruences has three linearly independent solutions.
These solutions define $V_1$ itself and two others which we denote by $V_2$ and $V_3$.
Then every element of $\{V_2, V_3\}$ is permutable with $V_1$. Hence G contains
at least p + 1 abelian subgroups of order $p^{c+2}$. Two such groups are simply
isomorphic since $V_1$, $V_2$, $V_3$ may be taken as generators. They satisfy (3)
with = B, =: Conditions that the rank of may be made 
l are: y2 Be = = 9, since k, implies a, = 0. 

For the remaining groups we may suppose that M becomes of rank 2 for 
—=m,=0 and a, and k, satisfying the quadratic which precedes (4). 
Writing this quadratic in the form 

(a, k,) (a, —— = 9, 
we note that a,/k, =A, or Az and unless A, = Az there exist two independent 
sets a,,h,,0,0 and a’,, k’;,0,0 each of which reduces the rank of MW to 2. 
When A, A, the system of congruences for the determination of dz, kz, 12, me 
hecomes 
dz + Aik, + al. + a.m, = 0, 
(Ai — B) le — Bom. = 0, 
+ (Ai — yz) me = 0,7 


where the last two are dependent. The following sets reduce .V/ to a matrix 
of rank 2: 

As, 1,050 at's, — Ba, + Bi- 




The sets in the same row determine permutable operators $V_1, V_2$ and $V'_1, V'_2$.
If $\beta_2 \neq 0$ and $\lambda_1 \neq \lambda_2$, then these four operators are independent and they
generate G. G then contains two abelian subgroups of order $p^{c+2}$ and is a
subgroup of the holomorph of $\{C, V_1, V_2\}$. Such a group is completely
determined by the above mentioned properties.⁵ It is not simply isomorphic with
any of the groups determined previously. The assumption that $\beta_2 \neq 0$ is not
essential, for if $\beta_2 = 0$, we solve the first and third congruences and obtain
the same result so far as abelian subgroups of order $p^{c+2}$ are concerned and
these abelian subgroups determine G.

If $\lambda_1 = \lambda_2$ then considerations similar to those used above show that G
contains but one abelian subgroup of order $p^{c+2}$. Under the given conditions
on the exponents new generators $U'_1, U'_2, U'_3$ in U may be found
such that and This shows the 
existence and the uniqueness of the group with one abelian subgroup of
order $p^{c+2}$.

Hence for k = 3 and c = 4 there are four groups having respectively
0, 1, 2, and p + 1 abelian subgroups of order $p^{c+2}$; any two groups with the
same number of abelian subgroups of order $p^{c+2}$ are simply isomorphic.

For c = 3 generators of G satisfy the relations:


U,*e, — — U 
(6) = 882, = U 18, 
= 883, 0,702.0, = 


Conditions for the permutability of V, and V2 require the rank of 


—k, a,— al,— a.m, ak, —a3m, +- asl, 
M=|{—1, — Bil, — Bom, a -+- Bik, — + 


to be at most 2 for a proper choice of $V_1$. For certain groups G, in other
words for certain sets $\alpha, \beta, \gamma$, it is possible to choose $V_1$ so that M has rank 1.
In such a case $V_1$ is permutable with $V_2$ and $V_3$, and $V_1, V_2, V_3$ are independent.
Since H is maximal abelian, s is not contained in $\{V_1, V_2, V_3\}$,
each operator of which reduces the rank of M to 2 or 1. Hence G is generated
by s, $V_1$, $V_2$, and $V_3$. We may take $U_i$ to be $V_i$ and assume G to be generated
by operators which satisfy (6) with $\alpha_1 = \beta_1 = \gamma_1 = \alpha_2 = \beta_2 = \gamma_2 = 0$.
If in addition $\alpha_3 = \beta_3 = \gamma_3 = 0$, then $U_2$ and $U_3$ also reduce the rank of M
to 1. This group is generated by the two abelian subgroups H and U, and is


⁵ Cf. “On the metabelian groups which contain a given group H as a maximal
invariant abelian subgroup,” American Journal of Mathematics, vol. 56 (1934), p. 510.
This is the first group in the table. 




therefore a subgroup of the holomorph of H. It is completely determined by
its order and the order of its commutator subgroup.⁶ It contains $p^2 + p + 1$
abelian subgroups of order $p^{c+2}$ corresponding to the same number of subgroups
of order $p^2$ in U.

If $\alpha_3, \beta_3, \gamma_3$ are not all zero, we note that every element of $\{U_2, U_3\}$
reduces the rank of M to 2, since every such element is permutable with $U_1$.
G therefore contains at least p + 1 abelian subgroups of order $p^{c+2}$, one for
each subgroup of order p in $\{U_2, U_3\}$. It will be convenient to write M in
the more special form


M’ =| — 0 Bal; 
—m, 0 + yal; 


The choice $a_1, k_1, l_1, m_1 = 0, \alpha_3, \beta_3, \gamma_3$ reduces the rank of M' to 1. The
corresponding operator $V_1$ is permutable with $U_1$ since every operator of U
is permutable with $U_1$; it is permutable with $V_2$ determined by $\beta_3, 0, 0, 1$
which is not in U if $\beta_3 \neq 0$; and it is permutable with $V_3$ determined by
$-\gamma_3, 0, 1, 0$ which is not in U if $\gamma_3 \neq 0$. If neither $\beta_3$ nor $\gamma_3$ is zero, the
operator $V_3$ is in the group $\{U_1, V_1, V_2\}$ since

$$\begin{matrix} 0, & \alpha_3, & \beta_3, & \gamma_3 \\ \beta_3, & 0, & 0, & 1 \\ -\gamma_3, & 0, & 1, & 0 \\ 0, & 1, & 0, & 0 \end{matrix}$$

are linearly dependent. If not both $\beta_3$ and $\gamma_3$ are zero, then G is generated
by $U_1, U_2, V_1$, and $V_2$ or $V_3$. The pairs $U_1, U_2$ and $V_1, V_2$ (or $V_1, V_3$) are
permutable. Hence if not both $\beta_3$ and $\gamma_3$ are zero G is generated by two of
its abelian subgroups and hence is a subgroup of the holomorph of $\{C, U_1, U_2\}$.
This group is generated by operators which satisfy the relations


= S254. 


The group is unique;⁷ it contains 2p + 1 abelian subgroups of order $p^{c+2}$.
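The linear dependence invoked above can be checked mechanically. The following is only a sketch under stated assumptions: each operator is represented by its exponent set (a, k, l, m), read here as the exponents of s, U_1, U_2, U_3 in that order (an assumed ordering), and the prime p = 7 and the function names are illustrative choices, not taken from the paper. It verifies that γ_3·V_2 + β_3·V_3 agrees with V_1 − α_3·U_1 modulo p for every choice of α_3, β_3, γ_3.

```python
from itertools import product

p = 7  # an illustrative prime; the identity checked below does not depend on it


def exponent_sets(alpha3, beta3, gamma3):
    """Exponent sets (a, k, l, m) of U1, V1, V2, V3, reduced mod p (assumed ordering)."""
    U1 = (0, 1, 0, 0)
    V1 = (0, alpha3, beta3, gamma3)
    V2 = (beta3, 0, 0, 1)
    V3 = (-gamma3 % p, 0, 1, 0)
    return U1, V1, V2, V3


def dependence_holds(alpha3, beta3, gamma3):
    """Check gamma3*V2 + beta3*V3 == V1 - alpha3*U1 (mod p), coordinate by coordinate."""
    U1, V1, V2, V3 = exponent_sets(alpha3, beta3, gamma3)
    lhs = tuple((gamma3 * v2 + beta3 * v3) % p for v2, v3 in zip(V2, V3))
    rhs = tuple((v1 - alpha3 * u1) % p for v1, u1 in zip(V1, U1))
    return lhs == rhs


# Run through every triple (alpha3, beta3, gamma3) mod p.
all_dependent = all(dependence_holds(a, b, g) for a, b, g in product(range(p), repeat=3))
print(all_dependent)
```

When β_3 is not zero the relation can be solved for V_3, which is one way to read the statement that V_3 lies in the group generated by U_1, V_1, V_2.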

If $\beta_3 = \gamma_3 = 0$ and $\alpha_3 \neq 0$, then the rank of M' is 3 unless $a_1 = 0$; and
if $a_1 = 0$ the rank is 2 unless $l_1 = m_1 = 0$. Therefore $U_1$ is the only operator
of G which reduces the rank of M' to 1. The only abelian subgroups of order
$p^{c+2}$ of G correspond to subgroups of order $p^2$ of U which contain $U_1$. Hence


⁶ Cf. the preceding reference, p. 495, Theorem 5.1.
"CE. loc, cit., p. 510. This is the group of order p"** with K of order p® and one 
subgroup of type 1. 




G contains p + 1 such subgroups, which distinguishes G from the groups
which have been discussed. These p + 1 abelian subgroups of order $p^{c+2}$ are
all contained in a non-abelian subgroup of order $p^{c+3}$, generated by C, $U_1$, $U_2$,
and $U_3$, which will distinguish it from another group containing p + 1
abelian subgroups of order $p^{c+2}$ which will follow. In the present case an
obvious change of generators will make $\alpha_3 = 1$, which shows that any two
such groups are simply isomorphic. G is not in the holomorph of any of its
abelian subgroups since all of its abelian subgroups are in H or in $\{C, U\}$
and none of the latter contains both $U_2$ and $U_3$.

For none of the remaining groups with c = 3 can $V_1$ be selected to make
the rank of M smaller than 2. We consider those for which the rank of M
may be reduced to 2. For these we may assume $\alpha_1 = \beta_1 = \gamma_1 = 0$. Then
both $U_1$ and $U_2$ reduce the rank of M to 2. If there exists a $V_1$ not in $\{U_1, U_2\}$
which also reduces the rank to 2, the corresponding $V_2$ will not be in $\{U_1, U_2\}$
for otherwise $V_2$ would reduce the rank to 1. Every operator of $\{V_1, V_2\}$ will
reduce the rank to 2. Hence there exists a $V_1$ with $m_1 = 0$ and not in
$\{U_1, U_2\}$ which reduces the rank to 2. Under these conditions M takes the form:


ky ay 0) Gok, +- a1, 
mM” 0 ay Bok, + > 
—0 0 0 ay, + + 


Now $V_1$, being not in $\{U_1, U_2\}$, must have $a_1 \neq 0$. Such a $V_1$ exists only if
not both $\gamma_2$ and $\gamma_3$ are zero. Hence if $\gamma_2 = \gamma_3 = 0$, G contains but one abelian
subgroup of order $p^{c+2}$. The condition that no operator of $\{U_1, U_2\}$ reduces
the rank to 1 requires that (#8; — 82) be different from zero which implies 
that (8; — #2)? + 4a,82 be not a square. The existence of such a group is 
obvious ; we omit for the moment consideration of the question of uniqueness. 
If not both y2 and y; are zero, we may suppose that y20. Then — 2, 1,9, 
determines an operator $V_1$ which reduces the rank of M″ to 2. In this case
G contains at least two abelian subgroups of order $p^{c+2}$, and since the rank of
M cannot be made 1 the corresponding $V_2$ is not in $\{U_1, U_2, V_1\}$. Hence
such a group, if it exists, is generated by the two abelian subgroups $\{C, U_1, U_2\}$
and $\{C, V_1, V_2\}$. It is therefore a subgroup of the holomorph of either. It is
identified as the group⁸ with commutator subgroup of order $p^3$ and no subgroup
of Type 1. The existence is established by showing that the group
described in the paper referred to contains a maximal abelian subgroup of
order $p^4$. It contains p + 1 abelian subgroups of order $p^{c+2}$. Since it is
generated by two of these abelian subgroups it is distinguished from the group


⁸ Cf. loc. cit.


with p + 1 abelian subgroups of order $p^{c+2}$ all contained in a subgroup of
order $p^{c+3}$.

We shall now see that the next to the last group is uniquely determined
by the property of having one abelian subgroup of order $p^{c+2}$. This group is
obviously not in the holomorph of any of its abelian subgroups, since it contains
no abelian subgroup of order $p^{c+3}$ and but one of order $p^{c+2}$. We then
fix attention on the characteristic subgroup $\{C, U_1, U_2\}$ which we denote by
H’. With this change of notation 


we have the following relations satisfied : 


The commutator subgroup arising from transformation of H' by $\{U'_1, U'_2\}$
is of order $p^3$. $U'_1$ and $U'_2$ determine two permutable operators in the group
of isomorphisms of H' which with H' give the particular subgroup⁹ of the
holomorph of H' which has no subgroup of Type 1. A choice of generators
to give the canonical form of generating relations of this subgroup of the
holomorph of H' gives a canonical form for generating relations of G. This
form is the above set with a =~; —0, B. =—1, a; =—r, where r is a 
particular not-square. The possibility of doing this is a consequence of the 
fact that (8, —a.)* + 4,8. is not a square. The canonical form contains 
no arbitrary constants and hence the group is uniquely defined by the fact
that it contains just one abelian subgroup of order $p^{c+2}$.

For arbitrary $\alpha, \beta, \gamma$ there exists $V_1$ such that the rank of M reduces to 2
when the corresponding $a_1, k_1, l_1, m_1$ are substituted. A proof of this will
establish the fact that every group G with c = 3 contains at least one abelian
subgroup of order $p^{c+2}$, and hence is one of the five groups determined above.

If there exists a $V_1$ which reduces the rank of M to 2, there exists one
with $m_1 = 0$. Suppose that $a_1 = 0$ also for this $V_1$. The condition that such
a $V_1$ exist is that $F(k_1, l_1) = 0$, where F is


(Biy2 — Boy: + (Biys — Bsy1— + )kyl, + (A371 1,7. 


There exist quantities $\alpha, \beta, \gamma$ such that this congruence is irreducible; we may
suppose that such is the case, for otherwise we have the existence of the required
$V_1$. Hence we may assume that $a_1 \neq 0$. When $a_1 \neq 0$, the first column of M
is expressible linearly in terms of the last three columns, and hence the rank


⁹ Cf. loc. cit., p. 510 and p. 500.


of M is the same as the rank of the matrix composed of the last three columns.
The determinant of this matrix, with $m_1 = 0$, is


+ {(Bi + y2)hi + (ys — + ]. 


The quadratic factor can always be made zero by a proper choice of $a_1, k_1, l_1$,
and since $F(k_1, l_1)$ has no linear factors $a_1$ will not be zero. This completes
the proof.

We give below a table of the groups for k < 4. For given k and c the
different groups are arranged in the order of their appearance in this paper.


k    c    number of abelian subgroups of order $p^{c+2}$


0 
0 
1 
0 
1, 0 
0,p+1,2,1 


* These two groups are distinguished by the fact that the first contains an
abelian subgroup of order $p^{c+2}$ and the other does not.




3. A geometric description of the groups for k = 3. It has been
convenient in the preceding pages to single out the maximal abelian subgroup
H, and consequently to distinguish between s and the other generators. In
the sequel we shall drop this distinction and consider the same groups generated
by the operators $U_1, U_2, U_3$, and $U_4$. Every such group is defined by
a set of relations on the $U_i$'s. Each set contains six relations which define
commutators of pairs of $U_i$'s. If there are no more relations, aside from those
expressing permutability of the commutators, that is, if the six commutators
are independent, then the group G is the master group described in Section 1.
Any other group generated by four $U_i$'s is defined by additional relations
among these six commutators and hence is a quotient group of this group of
order $p^{10}$ with respect to some subgroup of the commutator subgroup. The
existence of two kinds of group with c = 5 shows that the commutator subgroup
of G, of order $p^6$, contains two kinds of cyclic subgroup. Distinguishing
properties of these two groups with c = 5 are that one contains an abelian
subgroup of order $p^{c+2}$ and the other does not. G itself contains no abelian
subgroup of order $p^{c+2}$ and consequently if the group with c = 5 contains
such a subgroup the process of taking the quotient group introduces permutability
among operators of the form $U_1^{x_1}U_2^{x_2}U_3^{x_3}U_4^{x_4}$ where none existed in G.




This means that the cyclic group which is set equal to identity contains commutators.
In the other case for c = 5 the cyclic group contains no commutator
except identity. We therefore examine the question of commutators and
non-commutators in the commutator subgroup of the master group G.
The group G is defined by the relations:
U;7U,U; = (i<j, j=2, 3,4), 
= Txt" sj, (1, 7, == 1, 2, 3, 4). 


If the six commutators are ordered then every operator of the commutator subgroup of G
is determined by the set of exponents of the commutators in the expression for it; two
operators belong to the same cyclic group if and only if their sets of exponents
are linearly dependent. The set (0, 0, ···, 0) corresponds to identity. Hence,
if the sets of exponents are taken to be the coördinates of points in a finite
projective space R of 5 dimensions, every point in R will correspond to a cyclic
subgroup of the commutator subgroup of G. We determine the condition that
the point $\alpha = (\alpha_1, \alpha_2, \cdots, \alpha_6)$, which corresponds to the product of the six
commutators raised to the powers $\alpha_1, \cdots, \alpha_6$,
represent a cyclic subgroup which contains a commutator.¹⁰ If $\alpha$ represents
a commutator then there exist two operators


which have the element corresponding to « for their commutator. 


This is exactly the problem of the Plücker line coördinates in a projective
three-space. We then have the following theorem:


A point $\alpha$ in the space R corresponds to a commutator if and only if it
lies on the four-dimensional spread S defined by


$\alpha_1\alpha_6 - \alpha_2\alpha_5 + \alpha_3\alpha_4 = 0.$
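The theorem can be tested by brute force for a small prime. The sketch below rests on assumptions that are not taken from the paper: the six commutators are ordered lexicographically (pairs 12, 13, 14, 23, 24, 34), the commutator of two operators with exponent vectors x and y has exponents equal to the 2×2 minors x_i y_j − x_j y_i (the overall sign convention does not affect the test), and p = 3 is chosen only to keep the enumeration small. It compares the points of R arising from commutators with the points satisfying the Plücker relation.

```python
from itertools import combinations, product

p = 3  # a small prime, chosen only to keep the enumeration short


def commutator_exponents(x, y):
    """Exponents mod p of the six commutators (pairs 12, 13, 14, 23, 24, 34)."""
    return tuple((x[i] * y[j] - x[j] * y[i]) % p for i, j in combinations(range(4), 2))


def on_S(a):
    """Pluecker relation a1*a6 - a2*a5 + a3*a4 = 0 (mod p), with 0-based indices."""
    return (a[0] * a[5] - a[1] * a[4] + a[2] * a[3]) % p == 0


def normalize(a):
    """Scale a nonzero vector so its first nonzero coordinate is 1 (one vector per point of R)."""
    lead = next(c for c in a if c)
    inverse = pow(lead, -1, p)
    return tuple((c * inverse) % p for c in a)


vectors = list(product(range(p), repeat=4))

# Points of R that arise as commutators of two operators.
commutator_points = set()
for x in vectors:
    for y in vectors:
        a = commutator_exponents(x, y)
        if any(a):
            commutator_points.add(normalize(a))

# Points of R lying on the quadratic spread S.
points_on_S = {normalize(a) for a in product(range(p), repeat=6) if any(a) and on_S(a)}

print(len(commutator_points), len(points_on_S))
```

For p = 3 both sets should contain (p² + 1)(p² + p + 1) = 130 points, the number of lines of the projective three-space over the field of p elements.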


A point in the space R corresponds to a cyclic subgroup in the commutator
subgroup K of G; a line in R, being the set of points linearly dependent
on two points, corresponds to a subgroup of order $p^2$ in K; and a plane corresponds
to a subgroup of order $p^3$ in K. The effect of taking the elements of
a particular subgroup of K to be identity, so far as abelian subgroups of order
$p^{c+2}$ of the resulting quotient group are concerned, depends on the relation of
the corresponding point, line, or plane to the quadratic spread S. If the point,
line, or plane has a point in common with S, the resulting quotient group will

¹⁰ Since $V_1^{-1}V_2^{x}V_1 = V_2^{x}r^{x}$, if r is the commutator of $V_1$ and $V_2$, then every element
of a cyclic group is a commutator or else none (except identity) is a commutator.



contain two permutable elements $V_1$ and $V_2$, which with C give an abelian
subgroup of order $p^{c+2}$.

The points of R constitute two classes with respect to S, a point being
either on S or not on S. If the point is not on S the corresponding quotient
group contains no abelian subgroup of order $p^{c+2}$. The fact that there is but
one such group for c = 5 means that all points not on S are alike. The
canonical form obtained for generating relations of this group in Section 2
has the commutator of $U_1$ and $U_2$ the same as that of $U_3$ and $U_4$. Thus by
putting equal to identity the elements of a cyclic group corresponding to a
point not on S, a group of order $p^2$ determined by two points on S is reduced
to a cyclic group of order p. Hence every point of R not on S is on a line
joining two points of S.

A line in R may have no points on S; it may have one point on S; it may
have two points on S; or it may lie wholly on S. These possibilities correspond
to the groups with c = 4 with 0, 1, 2, p + 1 abelian subgroups of
order $p^{c+2}$.
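The four possibilities for a line can be confirmed by enumeration in a small case. This is a sketch under assumptions: the spread S is taken to be the quadric a_1 a_6 − a_2 a_5 + a_3 a_4 = 0 in coördinates ordered as in the Plücker relation, and p = 3 is an illustrative prime. Every line of R is generated from a pair of distinct points, and the number of its points on S is recorded.

```python
from itertools import combinations, product

p = 3  # illustrative prime


def on_S(a):
    """Assumed form of the quadratic spread S (Pluecker relation), mod p."""
    return (a[0] * a[5] - a[1] * a[4] + a[2] * a[3]) % p == 0


def normalize(a):
    """Scale a nonzero vector so its first nonzero coordinate is 1."""
    lead = next(c for c in a if c)
    inverse = pow(lead, -1, p)
    return tuple((c * inverse) % p for c in a)


# One normalized representative for each point of the 5-dimensional space R.
points = sorted({normalize(a) for a in product(range(p), repeat=6) if any(a)})


def line_through(P, Q):
    """The p + 1 points of the line joining two distinct points P and Q."""
    line = {P}
    for t in range(p):
        line.add(normalize(tuple((t * a + b) % p for a, b in zip(P, Q))))
    return line


# Each line is visited several times (once per pair of its points); only the
# set of intersection sizes with S is kept.
intersection_sizes = {
    sum(on_S(X) for X in line_through(P, Q)) for P, Q in combinations(points, 2)
}
print(sorted(intersection_sizes))
```

For p = 3 the sizes found should be 0, 1, 2 and p + 1 = 4, in agreement with the four groups listed for c = 4.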

A plane in R has at least one point in common with S. It may cut S in
one point, in a line, in a proper conic, in two lines, or it may lie wholly in S.
These possibilities correspond to the groups with c = 3 and 1, p + 1, p + 1,
2p + 1, and $p^2 + p + 1$ abelian subgroups of order $p^{c+2}$ respectively.

Each of the last three groups is a subgroup of the holomorph of one of
its abelian subgroups. A geometric criterion for this possibility involves a
consideration of the three-space X with coördinates $(x_1, x_2, x_3, x_4)$. Let us consider the case
c = 3. A particular group is determined by a plane in R which has a certain
set of points on S. Each point on S determines a line in X. Each line in X
is determined by two of its points. Two skew lines in X will be determined
by four points in terms of which every point of X can be expressed linearly.
If two lines in X are not skew, then the line joining the corresponding points
in R lies wholly on S. Thus in the case of the third and fourth groups
above, where the plane cuts S in a proper conic and in a conic consisting of a
pair of lines, it is possible to select two points on the intersection which
represent skew lines in X. In those two cases the groups are subgroups of the
holomorph of the abelian group of order $p^{c+2}$. For the first two cases it is not
possible to make such a selection. A plane wholly on S determines a set of
lines in X which are also on a plane. The corresponding points on X represent
an abelian group of order $p^{c+3}$ in G. Any metabelian group with operators
all of order p which contains an abelian subgroup of index p is in the holomorph
of that abelian subgroup. For c = 4 the group is determined by a line
in R. The only one of these groups which is in the holomorph of one of its
abelian subgroups corresponds to a line in R which has two and only two
points on S. In the case where the line is wholly on S each of its points
determines a line in X but all of these lines in X are in a plane. Since not
every line in this plane in X is represented by a point on the intersection of S
with the given line in R, this plane in X corresponds to a non-abelian subgroup
of order $p^{c+3}$ in the given group.


¹¹ Cf., for example, Veblen and Young, Projective Geometry, vol. I (1910), p. 329,
Theorem 30.

In general, for an arbitrary number k of generators, the condition that
a group G' be in the holomorph of one of its abelian subgroups may be stated
in geometric form. If G' is in the holomorph of one of its abelian subgroups,
then the points of the space X can be expressed as linear combinations of
points of two of its subspaces each of which determines an abelian subgroup
of G'. An abelian subgroup of order $p^{c+k_1}$ in G' determines a $(k_1 - 1)$-space
in X all of whose lines determine points on the intersection of S with the
m-space in R which determines G' as a quotient group of the master group G.

The geometric aspect of the solution of the problem of classification of
metabelian groups with k independent generators and elements all of order p
is now clear. It involves the extension of the theory of Plücker line coördinates
to a space of k − 1 dimensions. These coördinates determine a space R of
k(k − 1)/2 − 1 dimensions. In R the points which correspond to commutators
are on a subspace S of 2k − 4 dimensions defined by (k − 2)(k − 3)/2
quadratic congruences, the conditions that a point of R represent a line of X.
It then involves the determination of the relations of points, lines, planes,
three-spaces, etc. in R to the subspace S. These relations determine the possible
types of quotient groups of G. In determining the types of relations to S
of flat m-spaces in R it is necessary to separate the m-spaces of R into classes,
all the members of a class being conjugate under the group of collineations
of R which leave S invariant.¹² This group is closely connected with the
group of collineations in X. Of course the transformations are “rational”
and the geometry is finite.


¹² It is this part of the problem that made necessary most of the detail of Section 2.
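The dimension counts in the paragraph above are easy to tabulate. The helper below is a small illustrative sketch (the function name and the dictionary keys are not from the paper); it evaluates the three formulas for a given number k of generators and makes visible that the number of quadratic congruences equals the difference between the dimensions of R and S.

```python
def plucker_dimensions(k):
    """Dimensions quoted in the text for k generators (line coordinates in a (k-1)-space)."""
    dim_R = k * (k - 1) // 2 - 1          # space of commutator exponents
    dim_S = 2 * k - 4                     # subspace of points representing lines of X
    quadrics = (k - 2) * (k - 3) // 2     # number of quadratic congruences defining S
    return {"dim_R": dim_R, "dim_S": dim_S, "quadrics": quadrics}


for k in (4, 5, 6):
    print(k, plucker_dimensions(k))
```

For k = 4 this gives the five-dimensional space R, the four-dimensional spread S and the single quadratic relation used in Section 3.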




AN EXTENSION OF ANALYTIC FUNCTIONS TO MATRICES.* 
By R. W. WAGNER.


The analytic functions of a complex variable have many interesting 
properties. The property of analytic continuation is made the basis for this 
extension of such functions to matrices. The extended function is then a 
mapping of a subset of the matrix space on to another subset of the same 
space. The procedure followed here is to replace the complex variable in a 
power series by a variable matrix, show that the resulting matrix function can 
be reduced to the original function of several complex variables, and apply 
the process of analytic continuation to each variable. The most interesting 
results of this paper concern the singularities of the extended function which 
are introduced by the extension. The last part of the paper shows how this 
approach may be applied to the solution of certain matrix equations. 

The notational scheme is as follows: Capital script letters indicate
matrices, small letters indicate ordinary complex numbers, and subscripts are
used for enumerative purposes.


1. Let $\mathfrak{M}$ denote the matrix space, the space of all square matrices of n
rows whose elements are complex numbers. One can make this a metric space
by defining the absolute value of a matrix and then defining the distance from
X to Y to be the absolute value of X − Y. The absolute value of X will be
taken to be¹

$$|X| = \sqrt{\operatorname{tr} X\bar{X}'} = \sqrt{\sum_{i,j} |x_{ij}|^2}.$$


A similarity transformation applied to $\mathfrak{M}$ is just a change of coördinates
in $\mathfrak{M}$. Unfortunately, such a transformation does not leave the distance
invariant, but it is a homeomorphic transformation of $\mathfrak{M}$. Therefore, limiting
relations will be independent of the coördinate system, and it will be permissible
to use the most convenient coördinate system for investigating these
limits.
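That a similarity transformation need not preserve the absolute value can be seen on a 2×2 example. The sketch below assumes the absolute value is the square root of the sum of the squared moduli of the elements; the matrices X and W and the helper names are illustrative choices only.

```python
import math


def abs_value(X):
    """Absolute value of a square matrix: sqrt of the sum of |x_ij|^2."""
    return math.sqrt(sum(abs(x) ** 2 for row in X for x in row))


def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


X = [[0.0, 1.0], [0.0, 0.0]]
W = [[1.0, 0.0], [0.0, 2.0]]
W_inv = [[1.0, 0.0], [0.0, 0.5]]

# Transform of X by W: same elementary divisors, different absolute value.
Y = matmul(matmul(W_inv, X), W)
norm_X = abs_value(X)
norm_Y = abs_value(Y)
print(norm_X, norm_Y)
```

Here the transform has absolute value 2 while X has absolute value 1, although both matrices have the same characteristic roots.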

It is convenient to distinguish several subsets of $\mathfrak{M}$ for future reference.


I. @&(z2x), the set of matrices which have x for a characteristic root. 


* Received November 28, 1938; Revised August 19, 1939. 
¹ Compare with Wedderburn, [1], page 125. The numbers in square brackets refer
to the bibliography.



This set has dimensionality two less than $\mathfrak{M}$. For it is defined by the complex
equation
$$\det(X - x) = 0.$$


II. $\mathfrak{N}$, the set of matrices whose characteristic equation has distinct
and simple roots.

III. $\mathfrak{D}$, the set of matrices whose reduced characteristic equation is of
lower degree than the characteristic equation.

IV. $\mathfrak{S}$, the complement of $\mathfrak{D}$ in $\mathfrak{M} - \mathfrak{N}$.

The set $\mathfrak{D} + \mathfrak{S}$ is also $(2n^2 - 2)$-dimensional. For the coördinates of
the matrices which belong to it satisfy the complex equation obtained by
setting the discriminant of the characteristic equation equal to zero. These
equations define an algebraic locus, so that the following topological theorem
is valid.


THEOREM 1.1. Any matrix of $\mathfrak{S}$ or $\mathfrak{D}$ can be approached by matrices
which belong to $\mathfrak{N}$.

2. Corresponding to each elementary divisor of X is a pair of matrices
called partial idem-potent and nil-potent elements of X. If X does not belong
to $\mathfrak{D}$, these matrices are uniquely defined. If X belongs to $\mathfrak{D}$, they can be
found, but not uniquely. In case they are unique, they are called principal
idem-potent and nil-potent elements of X.²

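If X belongs to $\mathfrak{N}$ and has the characteristic equation


$$g(x) = (x - \lambda_1)(x - \lambda_2) \cdots (x - \lambda_n) = 0,$$
then the idem-potent elements of X are the matrices
(2.1)  $$P_i = \frac{(X - \lambda_1) \cdots (X - \lambda_{i-1})(X - \lambda_{i+1}) \cdots (X - \lambda_n)}{(\lambda_i - \lambda_1) \cdots (\lambda_i - \lambda_{i-1})(\lambda_i - \lambda_{i+1}) \cdots (\lambda_i - \lambda_n)}.$$
The nil-potent elements are all zero in this case.
But in any case the partial idem-potent elements, $P_i$, and the partial
nil-potent elements, $Q_i$, satisfy the equations

(2.2)  $$P_iP_j = \delta_{ij}P_i, \qquad P_iQ_j = Q_jP_i = \delta_{ij}Q_i.$$


In addition, the important identity,
(2.3)  $$X = \sum_{i=1}^{\nu} (\lambda_i P_i + Q_i)$$
is also true, $\nu$ being the number of elementary divisors.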
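The relations among the idem-potent elements can be verified exactly on a small example. The sketch below assumes that (2.1) is the usual product formula, each factor (X − λ_j) being divided by (λ_i − λ_j); the 3×3 matrix with roots 1, 2, 4 and the helper names are illustrative. Exact rational arithmetic is used so that the identities hold without rounding.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def principal_idempotents(X, roots):
    """P_i = product over j != i of (X - root_j) / (root_i - root_j), for distinct roots."""
    n = len(X)
    idempotents = []
    for i, root_i in enumerate(roots):
        P = identity(n)
        for j, root_j in enumerate(roots):
            if j == i:
                continue
            factor = [[(X[r][c] - (root_j if r == c else 0)) / (root_i - root_j)
                       for c in range(n)] for r in range(n)]
            P = matmul(P, factor)
        idempotents.append(P)
    return idempotents


# An upper triangular matrix, so its characteristic roots are the diagonal elements.
X = [[Fraction(1), Fraction(1), Fraction(0)],
     [Fraction(0), Fraction(2), Fraction(1)],
     [Fraction(0), Fraction(0), Fraction(4)]]
roots = [Fraction(1), Fraction(2), Fraction(4)]

P = principal_idempotents(X, roots)
zero = [[Fraction(0)] * 3 for _ in range(3)]

# Recombine: with distinct roots the nil-potent elements vanish and X = sum of root_i * P_i.
recombined = [[sum(roots[k] * P[k][r][c] for k in range(3)) for c in range(3)] for r in range(3)]
total = [[sum(P[k][r][c] for k in range(3)) for c in range(3)] for r in range(3)]
print(recombined == X, total == identity(3))
```

In this case the nil-potent elements are zero, so the identity (2.3) reduces to the sum of the roots times the idem-potent elements.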


² See [1], pages 27-29, 42. Also [2].



Let f(x) be an analytic function of the complex variable x. It is assumed
that everything is known about the function f(x), so that the only problem is
to extend the function to $\mathfrak{M}$. Assume that the origin is a regular point of
f(x) and that it is represented in a neighborhood of the origin, |x| < c,
by the power series


$$f(x) = \sum_{k=0}^{\infty} a_k x^k.$$
The expression


(2.4)  $$f(X) = \sum_{k=0}^{\infty} a_k X^k$$
defines a mapping of the part of $\mathfrak{M}$ for which the right member converges on
to a subset of $\mathfrak{M}$. Therefore it is considered as a part of the extended function.

THEOREM 2.1. f(X) is an absolutely continuous function of X on
|X| < c.

The proof of this theorem is exactly the same as that for the corresponding
theorem in function theory. Replacing X by its absolute value
reduces (2.4) to a power series in |X|.


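THEOREM 2.2. If |X| < c, then it is true that


$$f(X) = \sum_{k=1}^{\nu} \sum_{\sigma=0}^{n-1} \frac{f^{(\sigma)}(\lambda_k)}{\sigma!}\, P_k Q_k^{\sigma}.$$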


The first step of the proof is to show that each characteristic root of X is,
in absolute value, not greater than |X|. X transforms the unit sphere in a vector
space of n dimensions into an ellipsoid in this space. The absolute value
which has been chosen is the square root of the sum of the squares of the
principal semi-axes of this ellipsoid. Corresponding to each characteristic
root, $\lambda_k$, there is a vector, $v_k$, such that $Xv_k = \lambda_k v_k$. Therefore $|\lambda_k|$ is not greater
than the semi-major axis of this ellipsoid, and thus $|\lambda_k| \le |X|$.

If X is such that |X| < c' < c, for any ε > 0 there exists an m (which
we take greater than n) such that for |λ| < c'


and 

(2.6) | o(X) | <e 


From (2.2) and (2.3) one gets 


p=9 k=1 o=0 




But $Q_k$ is nil-potent. Therefore, powers of $Q_k$ higher than n may be omitted.
Thus, by combining (2.5) and (2.7), one gets


4 n v 1 
(Ae) Pau? |< (n+ 1)e 
p=0 g=0 k=1 


Combining this with (2.6) leads to 


| < (n+ 2¢ 


k=1 


Since ε can be taken arbitrarily small, the theorem is proved.

The statement of this theorem is an identity on the region |X| < c.
The analytic continuation in $\mathfrak{M}$ is based on this identity. The region on
which f(X) is defined is extended by assuming that a similar relation is valid
in $\mathfrak{M}$.


DEFINITION. If X has the partial idem-potent and nil-potent elements
$P_k$ and $Q_k$ associated with the roots $\lambda_k$, the matrices of the form


(2.8)  $$\sum_{k=1}^{\nu} \sum_{\sigma=0}^{n-1} \frac{f^{(\sigma)}(\lambda_k)}{\sigma!}\, P_k Q_k^{\sigma}$$


will be considered as values of the function, images of X. Moreover, if Y
approaches X, any limit of f(Y) is to be admitted as a value of f(X).


The above definition reduces the matrix function to a single function of
n independent variables on $\mathfrak{N}$. But the matrix X can be changed in two
distinct ways. One can change the characteristic roots, or he can change the
idem-potent elements, $P_k$. The equation (2.8) shows that, on $\mathfrak{N}$, the $P_k$
are not changed in passing from the argument to the value.


THEOREM 2.3. If X belongs to $\mathfrak{N}$, and if each $\lambda_k$ is a regular point of
f(x), all values of f(X) are given by (2.8).


If X belongs to $\mathfrak{N}$, the $P_k$ are given by (2.1). The $\lambda_k$ are continuous
functions of the coördinates, so that the same applies to the $P_k$. By hypothesis,
f(x) is continuous in the neighborhood of each $\lambda_k$. Therefore no limiting
process can produce limits not of the form of (2.8).


THEOREM 2.4. If W is non-singular, $f(W^{-1}XW) = W^{-1}f(X)W$.


Proof. It is easy to verify that the similarity transformation can be
applied to the power series (2.4) and to the matrices of the form (2.8).
All values are obtained from these, or by applying limiting processes to such
values. The similarity transformation is continuous. Hence the theorem is
established.



THEOREM 2.5. If X has an elementary divisor of degree m associated
with λ and if λ is such that the first, second, ···, (r − 1)-th derivatives of
f(x) vanish for x = λ, but $f^{(r)}(\lambda) \neq 0$, then f(X) will have s elementary
divisors associated with f(λ) where s is the smaller of r and m. These
elementary divisors will be of degree [m/r] or [m/r] + 1.


The proof of this theorem depends upon the identity (2.8) and the
consideration of the rank of powers of the Q associated with this elementary
divisor in X. The details are omitted. It was stated above that on $\mathfrak{N}$ the
elementary divisors were changed only in the root associated with them in
passing from the argument to the value. This theorem states that on $\mathfrak{S}$
an elementary divisor of the argument may be broken up by passing to the
value of the function. Other changes will appear later.
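Theorem 2.5 can be illustrated in the simplest case by the rank computation its proof alludes to. The sketch below makes several assumptions for the sake of a concrete example: the root is λ = 0, the argument X is a single nil-potent block of degree m, and f(x) = x^r, so that f(X) − f(0) is the r-th power of the block. The degrees of the elementary divisors of a nil-potent matrix are read off from the ranks of its powers; all function names are illustrative.

```python
from fractions import Fraction


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def power(A, e):
    """A raised to the non-negative integer power e."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(e):
        result = matmul(result, A)
    return result


def rank(A):
    """Rank by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r


def nilpotent_block(m):
    """Matrix with ones on the diagonal above the main one, zeros elsewhere."""
    return [[1 if j == i + 1 else 0 for j in range(m)] for i in range(m)]


def elementary_divisor_degrees(N):
    """Degrees of the elementary divisors of a nil-potent matrix N."""
    n = len(N)
    ranks = [rank(power(N, j)) for j in range(n + 2)]
    degrees = []
    for d in range(1, n + 1):
        at_least_d = ranks[d - 1] - ranks[d]        # divisors of degree >= d
        at_least_d_plus_1 = ranks[d] - ranks[d + 1]  # divisors of degree >= d + 1
        degrees += [d] * (at_least_d - at_least_d_plus_1)
    return sorted(degrees, reverse=True)


def degrees_after_power(m, r):
    """Elementary divisor degrees of f(X) - f(0) for f(x) = x**r and X a block of degree m."""
    return elementary_divisor_degrees(power(nilpotent_block(m), r))


print(degrees_after_power(3, 2), degrees_after_power(5, 3))
```

With m = 5 and r = 3 the block breaks into s = 3 divisors of degrees 2, 2 and 1, that is, [m/r] + 1 or [m/r], as the theorem states.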


3. This section is devoted to a discussion of the singularities of the 
extended function. It will appear that the extended function reflects the 
singularities inherent in the function and that the extension introduces some 
singularities if the original function is multiple-valued. 

If f(X) is single-valued in a neighborhood of XY = A, but discontinuous 
at A, A is called a singular point. A is called a pole of the function if the 
limit of f(X) is not finite no matter how X approaches A. 

If f(X) is multiple-valued, the point A will be called a branch point of 
f(X) if the number of values of the function is different for the point A and 
for points in every neighborhood of A. 


THEOREM 3.1. The matrices of $\mathfrak{S}$ are singular points of f(X) if, and
only if, f(x) is multiple-valued. These points are poles of some branches of
f(x).

In view of Theorem 2.4, it is permissible to give the proof in the most
convenient coördinate system. Another simplification is accomplished by using
matrices with only the essential parts appearing. Since the elementary
divisors enter into the function independently, additional elementary divisors
may be added later.

Let Y = λI + J, where λ is a regular point of f(x) and J is the matrix
all of whose elements are zero except those in the diagonal above the main
one, which have the value one. Let X[h] denote the matrix


$$X[h] = \begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 & 0 \\ 0 & \lambda + h & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda + (n-2)h & 1 \\ 0 & 0 & 0 & \cdots & 0 & \lambda + (n-1)h \end{pmatrix}.$$



The plan is to compute f(X[h]), its limit f(X[0]), f(Y) and to compare
the last two matrices.

Note that X[h] is a matrix in $\mathfrak{N}$. Therefore its principal idem-potent
elements are given by (2.1). Making this computation, one finds that the
elements of $P_k$ associated with the root λ + (k − 1)h of X[h] are either 0 or

1 
Putting these into (2.8) one gets 


where $\phi_{j_k}$ denote the various branches of f(x) in the neighborhood of λ. The
above can also be written


In case all the ¢;, are the same, 
difference of #; with respect to the increment h. Because the limit of the 
ratio of the r-th difference of a function to the r-th power of the interval is 
the r-th derivative, one gets in this case 


1 


(3. 1) f[X(0)] = 


When f(x) is single-valued, this reduction is possible.
But in case the $\phi_{j_k}$ are not all the same, one of the elements of f(X[h])
with s = r + 1, namely


+ th) — $5,(A + (r-—1)h)], 


has different values of f(x) in the numerator. Hence, in this case, one gets


$$f(X[0]) = \infty.$$
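The two cases can be observed numerically for n = 2. The sketch below is an illustration under assumptions: f is taken to be the square root, with its two branches ±√x, the regular point is λ = 4, and X[h] is the 2×2 matrix with diagonal λ, λ + h and a one above the diagonal. The value of f(X[h]) is formed from the idem-potent elements of (2.1), one branch being attached to each root; the element above the diagonal is then a difference quotient.

```python
import cmath


def sqrt_plus(z):
    """Principal branch of the square root."""
    return cmath.sqrt(z)


def sqrt_minus(z):
    """The other branch of the square root."""
    return -cmath.sqrt(z)


def f_of_X(lam, h, branch1, branch2):
    """A value of f(X[h]) for X[h] = [[lam, 1], [0, lam + h]], one branch per root."""
    l1, l2 = lam, lam + h
    X = [[l1, 1.0], [0.0, l2]]
    # Principal idem-potent elements: P1 = (X - l2)/(l1 - l2), P2 = (X - l1)/(l2 - l1).
    P1 = [[(X[i][j] - (l2 if i == j else 0.0)) / (l1 - l2) for j in range(2)] for i in range(2)]
    P2 = [[(X[i][j] - (l1 if i == j else 0.0)) / (l2 - l1) for j in range(2)] for i in range(2)]
    a, b = branch1(l1), branch2(l2)
    return [[a * P1[i][j] + b * P2[i][j] for j in range(2)] for i in range(2)]


def off_diagonal(h, branch1, branch2, lam=4.0):
    """Element above the diagonal, equal to (branch2(lam + h) - branch1(lam)) / h."""
    return f_of_X(lam, h, branch1, branch2)[0][1]


same_branch = off_diagonal(1e-6, sqrt_plus, sqrt_plus)
mixed_branches = [abs(off_diagonal(h, sqrt_plus, sqrt_minus)) for h in (1e-2, 1e-4, 1e-6)]
print(same_branch, mixed_branches)
```

With the same branch at both roots the element tends to the derivative of √x at 4, namely 1/4; with different branches it grows like 4/h, which is the behavior described as a pole.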


To complete the proof of the theorem, it is necessary to show that similar
limits are obtained by using other paths of approach. It was assumed that λ
is a regular point of f(x). Therefore, changing the manner in which the
roots of X[h] become equal can have no effect on the limits as long as the
path lies in $\mathfrak{N}$. The path can also be deformed in this way: let V[h] be a
non-singular matrix-valued function of h which is continuous for 0 ≤ h < 1.
Then $VXV^{-1}$ is a continuous function of h. In order for this matrix to
approach Y it is sufficient (and necessary) that V[0]Y = YV[0]. But, by
Theorem 2.4, one has $f(VXV^{-1}) = Vf(X)V^{-1}$ for h different from zero.
Also, a similarity transformation is continuous. Therefore, the limit of
$f(VXV^{-1})$ is $V[0]f(X[0])V[0]^{-1}$. In case f(X[0]) is finite, equation (3.1)
shows that there is a polynomial p such that p(Y) = f(X[0]). Therefore, in
this case V[0] commutes with f(X[0]), and the limit exists independent of
the path in $\mathfrak{N}$. On the other hand, if f(X[0]) is not finite, the similarity
transformation cannot change this property. Therefore since any path in $\mathfrak{N}$
can be obtained from the original path by a combination of the above distortions,
all paths in $\mathfrak{N}$ lead to the same limit.

It can be proved by induction that paths in $\mathfrak{S}$ lead to values of the type
(2.8) for arguments in $\mathfrak{S}$ also. The first step is the above proof for two
roots becoming equal. The inductive step can be carried out by using an
approximation involving matrices in $\mathfrak{N}$. Let $Z_m$ be a sequence of matrices
with elementary divisors of degrees less than n. Choose $X_m$ so that $f(X_m)$
approximates $f(Z_m)$ within an amount ε/m. Then the limit of $f(X_m)$ is the
same as the limit of $f(Z_m)$. Using the above result concerning the limit of
$f(X_m)$, one arrives at the theorem.


THEOREM 3.2. The points of D are singular points of $f(X)$ if, and
only if, $f(x)$ is multiple-valued. The singularity is of this nature: if $X$
approaches a point of D along some path the limit of $f(X)$ exists but depends
upon the path.

As before, let $X[h]$ be a point in N + §% and let its limit, $X[0]$, be a
point of D. Let $\phi_j$ denote various branches of $f(x)$ in the neighborhood of
$\lambda_k$. Then a value of $f(X)$ can be written in the form


(3.2) bh) Pet Pe 
Let X[0] have the form 
k=1 


h=re+1 


Then, on applying (2.8), a value of f(X[0]) has the form 
(3. 3) Py + f (Ax) Pee 


Note that, if the $\phi_{j_k}$ are not identical, the limit of $f(X[h])$ is not the
expression in (3.3). However, if the $\phi_{j_k}$ are the same (necessarily true for a
single-valued function), the limit of $f(X[h])$ is given by (3.3). Thus the
points of D are singular points of the matrix function.

Now let $U$ be a matrix which commutes with $X[0]$ but not with the
individual $P_k$ $(k = 1, 2, \cdots, r)$. Moreover, $U$ can be chosen so that it will




commute with a linear combination of these $P_k$ only if the coefficients are all
the same number. A similarity transformation is continuous. Therefore
one gets

$\lim_{h \to 0} f(U X[h] U^{-1}) = U \left[\lim_{h \to 0} f(X[h])\right] U^{-1}$.

In case the $\phi_{j_k}$ are not all the same, the coefficients of the $P_k$ will not all be
the same number, and $U$ cannot commute with the limit of $f(X[h])$. But
notice that $X$ and $UXU^{-1}$ approach $X[0]$ along different paths. Therefore
the theorem is established.


COROLLARY. If $f(x)$ is multiple-valued, and if $X$ is a point of D, the
matrices admitted as values of $f(X)$ are the transforms of the values of form
(2.8) by the group of non-singular matrices commutative with $X$.


THEOREM 3.3. Unless X belongs to D, all values of f(X) are of the 
form (2.8). 

This result is a combination of Theorem 2. 3, the proof of Theorem 3. 1, 
and the definition of the sets 2, 9, and D. The corollary describes the 
situation otherwise. 

The above discussion concerns the singularities which arise from the
extension of functions to matrices. The following theorems concern the
inherent singularities of the function.


THEOREM 3.4. If $x = a$ is a singular point of $f(x)$, the points of Oe (a)
are singular points of $f(X)$.

This theorem is important because it states that the point singularity of
$f(x)$ is exploded into a $(2n^2 - 2)$-dimensional singularity for $f(X)$. The
possibility of carrying the variable "around" a singularity is preserved.
The values obtained by a limiting process applied to (3.2) can also be obtained
by carrying a value (3.3) around the proper branch locus of $f(X)$, keeping
the argument in D.


THEOREM 3.5. If $X$ has an elementary divisor of degree $r$ associated
with $\lambda$, and if $f^{(s)}(\lambda) = \infty$ for some $s$ less than $r$, then $f(X) = \infty$.


This theorem is proved by substituting into (2.8), and then applying
Theorem 3.3 and the corollary. This theorem can be applied to show why a
nilpotent matrix with two rows has no square root. Such a matrix is a
singular point of the function.
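
The remark about the square root can be checked by hand. The following worked sketch is an illustration added here, not part of the original argument; the matrix name $N$ and the choice $f(x) = x^{1/2}$ are assumptions made for the example.

```latex
% Illustrative sketch: N is the nilpotent matrix with two rows, f(x) = x^{1/2}.
\[
N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad N \neq 0, \qquad N^2 = 0 .
\]
% Suppose Y^2 = N. Then Y^4 = N^2 = 0, so Y is nilpotent.
% A nilpotent matrix with two rows satisfies Y^2 = 0 (Cayley-Hamilton),
% which contradicts Y^2 = N \neq 0.
\[
Y^2 = N \;\Longrightarrow\; Y^4 = 0 \;\Longrightarrow\; Y^2 = 0 \neq N .
\]
% In the language of Theorem 3.5: N has an elementary divisor of degree r = 2
% associated with \lambda = 0, and f'(x) = \tfrac12 x^{-1/2} is infinite at x = 0,
% so the hypothesis holds with s = 1 < r.
```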


4. Let F(X) be defined to be any matrix which satisfies the equation 


(4.1) p(F(X)) = P(X) =X. 




In this section, the function F(X) defined here will be compared with the 
corresponding function f(X) defined in Section two. It will appear that 
f(X) is identical with the primitive solutions of (4.1). The primitive 
solutions are those solutions of (4.1) which are not solutions of both (4.1) 
and a lower degree equation. In making this comparison, certain results of 
Roth [3] and of Franklin [4] concerning the function F(X) will be used. 


THEOREM 4.1. If $f(x)$ is defined by the equation
$p(f(x)) = x$,
then the matrices $f(X)$ defined in Section two are solutions of
(4.2) $p(f(X)) = X$.


Proof. The values of the form (2.8) satisfy (4.2). Also the values of 
the form of the corollary of Theorem 3.2 satisfy (4.2). All values of f(X) 
are of one of these types or limits of these types. The operations of addition 
and multiplication are continuous. Therefore, any limit of matrices which 
satisfy (4.2) will also satisfy (4.2).

In order to prove the converse relationship, it will be convenient to trans- 
late some of Franklin’s results into the language of this paper. The solutions, 
F(X), of (4.1) can be put into three classes : 


A. Solutions which are in turn polynomials in X. 

B. Solutions similar to one of form (2.8) by a matrix commutative
with $X$.

C. All others. These solutions must have elementary divisors whose
degrees differ from the degrees of the divisors of $X$.


Roth showed that all solutions are of Type A unless $X$ belongs to D.
Franklin showed that, in general, solutions of Type B exist whenever $X$ is
derogatory. Furthermore, he showed that solutions of Type C exist only if
$X$ is a point of D and if $X$ has a root $\lambda$, such that $p'(f(\lambda)) = 0$, associated
with several elementary divisors.

The solutions of Type A are given by (2.8). Solutions of Type B are 
found in the corollary to Theorem 3.2. Hence it remains to show that the 
solutions of Type C can be obtained by a limiting process applied to values 
of the form of (2.8) or of the corollary. 

The condition $p'(f(\lambda)) = 0$ is equivalent to the condition that $f'(\lambda) = \infty$.
Therefore, solutions of Type C can exist only when $X$ has a root which is a
branch point of $f(x)$ associated with several elementary divisors.

The branch point of the function $x^{1/m}$ is typical of any branch point of




order m —1. Therefore, the branch point of this function will be investigated 
instead of the various branch points of the more general function. Let X[h] 
be the matrix 


| 0 0 0 
i| 
0 hl 0 0 


where $I_r$ is the identity matrix of $r$ rows, $s = r + 1$, and $J_r$ is the matrix of
$r$ rows with all elements zero except for 1's in the diagonal above the main one.
Let Y[h] have the form 


0) O 0 
Y[h] =| 0 0 By 
O 0 0 0 
0 0 0 bl, 


where w is a primitive m-th root of unity, b is an m-th root of h, and #, and 
F, are matrices of the form 


jo 1 0 0 of 

E, — | 0 @ = 
|0 001 0 
| 000 1 


By a direct expansion it is possible to verify that the $m$-th power of $Y[h]$
is $X[h]$. The blocks along the diagonal come out very simply, and the
coefficients of the other blocks are sums of powers of $\omega$ which are zero. Since
$Y[h]$ is a solution of the equation $Y^m = X$, it is of Type B for all values of
$h$ different from zero. Moreover, the equation connecting $X$ and $Y$ will be
valid in the limit. The rank of $X[0]$ is $n - m$, but the rank of $Y[0]$ is
$n - 1$. Therefore $Y[0]$ is an $m$-th root of Type C. Thus, by changing the
equation slightly, a solution of Type C was obtained as a limit of solutions
of Type B.
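
A small numerical sketch of this limiting phenomenon, for $m = 2$ and $n = 2$, may help. The matrices below are a hypothetical illustration chosen for brevity; they are not the block matrices $X[h]$ and $Y[h]$ of the text, and the helper names `matmul` and `rank2` are assumptions of the sketch.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rank2(a):
    """Rank of a 2x2 matrix."""
    if a[0][0] * a[1][1] - a[0][1] * a[1][0] != 0:
        return 2
    return 1 if any(v != 0 for row in a for v in row) else 0

def X(h):
    """Scalar matrix h*I: two elementary divisors of degree one for the root h."""
    return [[h, 0], [0, h]]

def Y(h):
    """A square root of X(h) that stays bounded as h tends to zero."""
    return [[0, 1], [h, 0]]

# Y(h)^2 = X(h) for every h, including the limiting value h = 0.
for h in (Fraction(1, 4), Fraction(1, 100), Fraction(0)):
    assert matmul(Y(h), Y(h)) == X(h)

# In the limit the ranks are n - m = 0 for X(0) and n - 1 = 1 for Y(0).
assert rank2(X(0)) == 0
assert rank2(Y(0)) == 1
```

Here `Y(0)` is a nilpotent square root of the zero matrix, so it cannot be a polynomial in `X(0)`; it plays the part of the Type C root in this small case.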

X[h] is a matrix in D. So, for each h, there are continua of matrices 
which satisfy the equation. One of them was selected and called $Y[h]$. Note




that Y[h] is not in D. The limit of Y[h] is in Y. In taking this limit 
two things vary; both the continuum from which Y[h] was selected and the 
relative position of Y[h] in the continuum. 


THEOREM 4.2. If $\lambda$ is a branch point of $f(x)$ of order $m - 1$, $f(\lambda)$
finite, and if $X$ has $m$ elementary divisors associated with $\lambda$ ($k$ of them of
degree $r + 1$ and the rest of degree $r$), then there exists a value of $f(X)$ in
which $f(\lambda)$ is associated with an elementary divisor of degree $mr + k$.


The above theorem states that some of the solutions of Type C can be 
obtained as values of the matrix function f(X). But the only solutions so 
obtained are those which utilize the complete symmetry of all the values of 
f(x) which merge at the branch point. Hence the above process can lead 
only to primitive solutions. However, the non-primitive solutions are primi- 
tive solutions of a lower degree equation. Therefore, if the above process 
yields all primitive solutions, it can be used to get all solutions. 

Recall that $X$ is a function, namely a polynomial, of $f(X)$. From this
view-point, Theorem 2.5 states that all primitive solutions will have the form
specified in the hypothesis of Theorem 5.1. Each continuum of values of
$f(X[h])$ leads to a continuum of values of Type C for $f(X[0])$. The various
primitive solutions of Type C differ only in the values of $f(\lambda)$ associated with
the elementary divisors. But one can achieve this same result by properly
choosing the continuum from which $Y[h]$ is chosen. Therefore, any solution
of Type C is a value of $f(X)$ as defined in Section two.


THEOREM 4.3. The function $f(X)$ is the same as the function defined
as the primitive solutions of (4.1).


THE UNIVERSITY OF WISCONSIN. 


BIBLIOGRAPHY. 


1. Wedderburn, Lectures on Matrices, American Mathematical Society Publication.

2. Wegner, "Über die Frobeniusschen Kovarianten," Monatshefte für Mathematik
und Physik, Bd. 40, pp. 201-208.

3. Roth, "A solution of the matrix equation P(X) = A," Transactions of the
American Mathematical Society, vol. 30, pp. 579-596.

4. Franklin, "Algebraic matrix equations," Journal für Mathematik und Physik,
Bd. 10, pp. 289-314.

5. Cipolla, "Sulle matrici espressioni analitiche di un'altra," Rendiconti Circolo
Matematico de Palermo, vol. 56, pp. 144-154.



LINEAR DIFFERENTIAL INVARIANCE UNDER AN OPERATOR
RELATED TO THE LAPLACE TRANSFORMATION.* ¹


By EARL D. RAINVILLE.


1. Introduction. The Laplace integral transformation ²


(1) $\mathcal{L}\{F(t)\} = \int_0^\infty e^{-st} F(t)\,dt = f(s)$,


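is one which associates with each function $F(t)$ of sufficient regularity another
function $f(s)$. Elementary known ³ properties of the operator $\mathcal{L}$ include


(2) $\mathcal{L}\{F^{(n)}(t)\} = s^n f(s) - \sum_{k=0}^{n-1} s^{n-1-k} F^{(k)}(0)$,

(3) $\mathcal{L}\{t^n F(t)\} = (-1)^n f^{(n)}(s)$.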


The Laplace transformation has important applications ⁴ to the solution
of boundary value problems in ordinary and partial linear differential equations.
The operator $\mathcal{L}$ often transforms one differential equation into another
which is more readily solved, one which, indeed, may even be algebraic. The
transformed equation may be of higher order, or otherwise more complicated,
than the original. Finally, we see that many equations do not change form
in any essential way when subjected to the operator $\mathcal{L}$. One such equation is


(4) $\frac{d^2 F}{dt^2} + t^2 F = 0$,


* Received August 15, 1939.

¹ Presented to the Society Nov. 26, 1938 under a slightly different title.

² For an extensive treatment of this transformation see G. Doetsch, Theorie und
Anwendung der Laplace-Transformation, Berlin, 1937.

³ Enzo Levi has shown that the case $n = 1$ of equation (2) above, together with
certain conditions on $F(t)$ and its transform, is sufficient to characterize the operator
completely. For the precise result see his paper, "Proprietà caratteristiche della
trasformazione di Laplace," Rend. Accad. Lincei, (6), vol. 24 (1936), pp. 422-426.

⁴ See, for example, R. V. Churchill, "The solution of linear boundary value problems
in physics by means of the Laplace transformation": I, Mathematische Annalen,
vol. 114 (1937), pp. 591-613; II, Mathematische Annalen, vol. 115 (1938), pp. 720-739.
See also his paper, "On the problem of temperatures in a non-homogeneous bar with
discontinuous initial temperatures," American Journal of Mathematics, vol. 61 (1939),
pp. 651-664, in which the Laplace transformation is used to establish a uniqueness
theorem.




for which the transformed equation is


(5) $\frac{d^2 f}{ds^2} + s^2 f = s F(0) + \left[\frac{dF}{dt}\right]_{t=0}$,

as may be seen from (2) and (3).

Our fundamental problem is suggested by the fact that (4) is essentially
invariant under $\mathcal{L}$. In order to use only those properties of $\mathcal{L}$ which are
concerned in that invariance we introduce another operator $\sigma$ and study $\sigma$
instead of $\mathcal{L}$.


DEFINITION 1. Let $D = d/dx$ be the usual symbol for differentiation
with respect to $x$; let $D^0 = 1$. Then any polynomial in $D$ and $x$ will be called
a linear differential operator of type P.


DEFINITION 2. Let $k, n, k_s, n_s$; $s = 1, 2, \cdots$, be non-negative integers.
We define ⁵ $\sigma$ as a linear operator on linear differential operators of type P by


(6) $\sigma[x^k D^n] = (-1)^k D^k x^n$
and
(7) $\sigma\left[\sum_s a_s x^{k_s} D^{n_s}\right] = \sum_s a_s \sigma[x^{k_s} D^{n_s}]$,


where the $a_s$ are any constants.

It should be noted that by $D^k x^n$ we mean that differential operator which,
acting upon a function $F$, yields the $k$-th derivative of the product $x^n F$.
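
The definition lends itself to direct computation. The following is a minimal sketch, assuming that an operator of type P is stored as a dictionary `{(k, n): c}` standing for the sum of the terms $c\,x^k D^n$, and reading (6) as $\sigma[x^k D^n] = (-1)^k D^k x^n$; the storage scheme and the function names `add`, `mul`, and `sigma` are assumptions of this sketch and do not come from the paper.

```python
from math import comb, factorial

# An operator of type P is stored as {(k, n): c}, meaning the sum of c * x^k D^n
# with powers of x written to the left of powers of D (an assumed convention).

def add(A, B, s=1):
    """Return A + s*B, dropping terms whose coefficient is zero."""
    C = dict(A)
    for key, c in B.items():
        C[key] = C.get(key, 0) + s * c
    return {key: c for key, c in C.items() if c != 0}

def mul(A, B):
    """Operator product, using D^b x^c = sum_j C(b,j) c!/(c-j)! x^(c-j) D^(b-j)."""
    C = {}
    for (a, b), ca in A.items():
        for (c, d), cb in B.items():
            for j in range(min(b, c) + 1):
                w = comb(b, j) * (factorial(c) // factorial(c - j))
                C = add(C, {(a + c - j, b + d - j): ca * cb * w})
    return C

def sigma(A):
    """Apply sigma termwise: sigma(x^k D^n) = (-1)^k D^k x^n."""
    C = {}
    for (k, n), c in A.items():
        C = add(C, mul({(0, k): 1}, {(n, 0): 1}), c * (-1) ** k)
    return C

x = {(1, 0): 1}
D = {(0, 1): 1}

assert mul(D, x) == {(1, 1): 1, (0, 0): 1}             # D x = x D + 1
assert sigma(x) == {(0, 1): -1}                        # sigma(x) = -D
assert sigma(D) == x                                   # sigma(D) = x
assert sigma({(2, 1): 1}) == {(1, 2): 1, (0, 1): 2}    # sigma(x^2 D) = D^2 x = x D^2 + 2D
```

The product rule in `mul` is Leibniz's formula for the $b$-th derivative of $x^c F$, which is exactly the meaning given to $D^k x^n$ in the note above.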

It is of some value to keep in mind one aspect of the nature of $\sigma$. Consider
a given function $f(x)$, taken to be single valued for the present purpose.
We may associate with $f(x)$ an operator $f$ which transforms each number $x$
of a certain set of numbers into another number $f(x)$ in another, or the same,
set of numbers. We call $f$ an operator of class one. Next consider $D$. The
operator $D$ transforms each function of a certain set of functions into another
function of $x$. Further, if $D$ operates on numbers, the result is trivial; i.e.,
$D$ transforms every number into the same number, zero. Hence, we call $D$
an operator of class two, noting that in a sense $D$ must operate on operators
of class one to give non-trivial results. Now consider $\sigma$. This operator is
defined above in such a way that it transforms each linear differential operator
into another linear differential operator. Essentially $\sigma$ needs to operate on
operators of class two to give non-trivial results. We call $\sigma$ an operator of
class three.

An adjoint operator may be defined such that it changes a linear differential
operator, not necessarily of type P, into its adjoint linear differential
operator. This adjoint operator ⁶ is of class three in the above sense.


⁵ Essentially this definition is to be found in S. Pincherle and U. Amaldi,
Operazioni distributive, Bologna, 1901, p. 361.


2. Results. Some useful, not all new, properties of $\sigma$ are obtained.
Two linear bases are found for the set of linear differential operators invariant
under $\sigma$. Two invariant second order differential operators are found to form
a fundamental system of invariant operators; i.e., any invariant operator may
be expressed as a polynomial in these two operators. A linear basis is exhibited
for what are called $t$-variants (Definition 3) with respect to $\sigma$.

Linear operational equations in $\sigma$ are completely solved in the case of
constant coefficients. This is done with the aid of two theorems on the representation
of linear differential operators in terms of $t$-variants or of invariants
and pseudo-invariants. The same tools are useful in the solution of linear
operational equations in $\sigma$ with variable (linear differential operational)
coefficients, as is demonstrated in the example worked out in Section 10.

In Section 9 certain results are specialized to yield a classification of the
differential equations, such as (4) above, invariant under $\sigma$.


3. Preliminary definitions and lemmas. Since the only linear dif- 
ferential operators to enter this study are of type P, we shall often hereafter 
omit mention of this restriction. 


DEFINITION 3. If $y$ is a linear differential operator such that $\sigma y = ty$
where $t^4 = 1$, then $y$ will be called a linear differential $t$-variant with respect
to $\sigma$. A 1-variant will be referred to on occasion as an invariant and a
$(-1)$-variant may be called a pseudo-invariant.


DEFINITION 4. The degree and the order of a linear differential operator 
are respectively the highest power of the independent variable and the order 
of the highest ordered derivative appearing explicitly in the operator. 


LEMMA 1. If $y$ is a linear differential operator, then the degree of
$\sigma y$ = the order of $y$ and the order of $\sigma y$ = the degree of $y$.


This lemma is an immediate consequence of the definition of $\sigma$. The
application of Lemma 1 leads to

⁶ E. D. Rainville, "Adjoints of linear differential operators," American Mathematical
Monthly, vol. 46 (1939), pp. 623-627. For relations between $\sigma$ and the adjoint
operator, see L. Schlesinger, Handbuch der Linearen Differentialgleichungen, Leipzig,
1895, vol. 1, p. 426 and E. D. Rainville, "A discrete group arising in the study of
differential operators," as yet unpublished.




LEMMA 2. For a linear differential operator to be $t$-variant with respect
to $\sigma$ it is necessary that its degree equal its order.

LEMMA 3. The linear differential operators $A_1 = D^2 + x^2$ and $A_2 = x^2 D^2 + 2xD$
are invariant with respect to $\sigma$.

We prove Lemma 3 by direct evaluation of $\sigma A_1$ and $\sigma A_2$.

$\sigma A_1 = x^2 + D^2 = A_1$.
$\sigma A_2 = D^2 x^2 - 2Dx = x^2 D^2 + 4xD + 2 - 2xD - 2 = A_2$.
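THEOREM 1. If $A$ and $B$ are linear differential operators, then
$\sigma(AB) = (\sigma A)(\sigma B)$.

Let $v = x^k D^n$, then

$\sigma(xv) = \sigma(x^{k+1} D^n) = (-1)^{k+1} D^{k+1} x^n = (-D)(-1)^k D^k x^n$,

so that we have

(8) $\sigma(xv) = (\sigma x)(\sigma v)$.

Next,

$\sigma(Dv) = \sigma(x^k D^{n+1} + k x^{k-1} D^n) = (-1)^k D^k x^{n+1} + (-1)^{k-1} k D^{k-1} x^n$
$= (-1)^k x D^k x^n + (-1)^k k D^{k-1} x^n + (-1)^{k-1} k D^{k-1} x^n = (-1)^k x D^k x^n$.

Then

(9) $\sigma(Dv) = (\sigma D)(\sigma v)$.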
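Theorem 1 follows directly from (8) and (9). Further,

(10) $\sigma^2(AB) = \sigma[(\sigma A)(\sigma B)] = (\sigma^2 A)(\sigma^2 B)$,

and, for any integral $k \ge 0$, $\sigma^k(AB) = (\sigma^k A)(\sigma^k B)$.

LEMMA 4. If $v = x^k D^n$, then $\sigma^2 v = (-1)^{k+n} v$.

By (10) above

$\sigma^2 v = (\sigma^2 x^k)(\sigma^2 D^n) = [\sigma(-1)^k D^k][\sigma x^n] = (-1)^{k+n} x^k D^n = (-1)^{k+n} v$.

Lemma 4 itself leads at once to ⁷


THEOREM 2. If $y$ is a linear differential operator, then $\sigma^4 y = y$.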
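
Lemma 3, Theorem 1, Lemma 4, and Theorem 2 can all be spot-checked mechanically. The sketch below assumes the same dictionary storage `{(k, n): c}` for the sum of terms $c\,x^k D^n$ and the reading $\sigma[x^k D^n] = (-1)^k D^k x^n$; the helper names and the sample operators `A`, `B`, `v` are hypothetical and chosen only for illustration. A check on a few operators is of course not a proof.

```python
from math import comb, factorial

def add(A, B, s=1):
    """Return A + s*B for operators stored as {(k, n): c} = sum of c * x^k D^n."""
    C = dict(A)
    for key, c in B.items():
        C[key] = C.get(key, 0) + s * c
    return {key: c for key, c in C.items() if c != 0}

def mul(A, B):
    """Operator product, using D^b x^c = sum_j C(b,j) c!/(c-j)! x^(c-j) D^(b-j)."""
    C = {}
    for (a, b), ca in A.items():
        for (c, d), cb in B.items():
            for j in range(min(b, c) + 1):
                w = comb(b, j) * (factorial(c) // factorial(c - j))
                C = add(C, {(a + c - j, b + d - j): ca * cb * w})
    return C

def sigma(A):
    """Apply sigma termwise: sigma(x^k D^n) = (-1)^k D^k x^n."""
    C = {}
    for (k, n), c in A.items():
        C = add(C, mul({(0, k): 1}, {(n, 0): 1}), c * (-1) ** k)
    return C

A1 = {(0, 2): 1, (2, 0): 1}        # D^2 + x^2
A2 = {(2, 2): 1, (1, 1): 2}        # x^2 D^2 + 2xD
assert sigma(A1) == A1 and sigma(A2) == A2            # Lemma 3

A = {(2, 1): 3, (0, 2): 1, (1, 0): -2}                # arbitrary sample operators
B = {(1, 3): 1, (0, 0): 5}
assert sigma(mul(A, B)) == mul(sigma(A), sigma(B))    # Theorem 1

v = {(3, 2): 1}
assert sigma(sigma(v)) == {(3, 2): -1}                # Lemma 4 with k + n odd

y = add(A, B)
assert sigma(sigma(sigma(sigma(y)))) == y             # Theorem 2
```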


4. First classification of invariants. Direct application of Theorem 1
and Lemma 3 yields


THEOREM 3. Any linear combination of terms of the type 


in which $m_i$; $i = 1, 2, \cdots, s$, are non-negative integers, is a linear differential
invariant with respect to $\sigma$.

⁷ Theorem 2 appears in Pincherle and Amaldi, loc. cit., p. 357.


Next we obtain a simple necessary condition for invariance under $\sigma$.


LEMMA 5. A necessary and sufficient condition that a linear differential
operator be invariant under $\sigma^2$ is that, for each term $a_{kn} x^k D^n$ of the operator,
$k \equiv n \bmod 2$.


This follows at once from Lemma 4. Noting that invariance under $\sigma$
implies invariance under $\sigma^2$, we have a necessary condition for the former.


THEOREM 4. For each term $a_{kn} x^k D^n$ of a linear differential invariant
with respect to $\sigma$ it is true that $k \equiv n \bmod 2$.


DEFINITION 5. The leading term of a linear differential operator is the
non-vanishing term of highest degree among those terms of highest order in
the operator.

It will prove useful to note that, since the order of the operator is the
order of its leading term, we have


LEMMA 6. In a linear differential $t$-variant the degree of the leading
term does not exceed its order.


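DEFINITION 6. By linear differential invariants of type H we mean the
set of invariant operators

$A_1^{n-k} A_2^k$; $0 \le k \le n$,

and

$\frac{1}{4(n-k)(k+1)} \left[A_1^{n-k} A_2^{k+1} - A_2^{k+1} A_1^{n-k}\right]$; $0 \le k < n$.

LEMMA 7. The leading term of $A_1^{n-k} A_2^k$; $0 \le k \le n$, is $x^{2k} D^{2n}$.

This follows at once from the definitions of $A_1$ and $A_2$.

LEMMA 8. The leading term of

$\frac{1}{4(n-k)(k+1)} \left[A_1^{n-k} A_2^{k+1} - A_2^{k+1} A_1^{n-k}\right]$; $0 \le k < n$,

is $x^{2k+1} D^{2n+1}$.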


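In the proof of Lemma 8 we shall use the convention that

$y = a_{\delta\epsilon} x^\delta D^\epsilon + a_{\zeta\eta} x^\zeta D^\eta + \cdots$

means that $y$ is a linear differential operator with leading term $a_{\delta\epsilon} x^\delta D^\epsilon$ and
that the leading term of $[y - a_{\delta\epsilon} x^\delta D^\epsilon]$ is $a_{\zeta\eta} x^\zeta D^\eta$.
With the above convention note that the formula

(13) $A_2^k = x^{2k} D^{2k} + 2k^2 x^{2k-1} D^{2k-1} + \cdots$

holds for $k = 1$ and in a trivial sense for $k = 0$. Assume (13) to hold for
some $k$. Then

$A_2^{k+1} = x^{2k+2} D^{2k+2} + (4k + 2k^2 + 2) x^{2k+1} D^{2k+1} + \cdots$,

and it follows by induction that (13) is true for any $k \ge 0$. Now

(14) $A_1^{n-k} A_2^k = x^{2k} D^{2n} + 2k(2n - k) x^{2k-1} D^{2n-1} + \cdots$

holds for $n = k \ge 0$. Assume (14) to hold for some pair of numbers $n, k$.
Then

(15) $A_1^{n-k+1} A_2^k = x^{2k} D^{2n+2} + 2k[2(n+1) - k] x^{2k-1} D^{2n+1} + \cdots$,

so that by induction (14) holds for any $n \ge k \ge 0$.

In view of (13) the application of $A_2^k$ to $A_1^{n-k+1}$ is seen to yield

(16) $A_2^k A_1^{n-k+1} = x^{2k} D^{2n+2} + 2k^2 x^{2k-1} D^{2n+1} + \cdots$,

for $n \ge k \ge 0$. Combining (15) and (16) with $k$ replaced by $(k+1)$, we
have as the leading term of $[A_1^{n-k} A_2^{k+1} - A_2^{k+1} A_1^{n-k}]$; $0 \le k < n$, the
expression $4(k+1)(n-k) x^{2k+1} D^{2n+1}$, so that Lemma 8 is established.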
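
Lemmas 7 and 8 can be spot-checked for small $n$ and $k$. The sketch below assumes the dictionary storage `{(k, n): c}` for the sum of terms $c\,x^k D^n$ and the reading $\sigma[x^k D^n] = (-1)^k D^k x^n$; the helpers `power` and `leading` are hypothetical names introduced for this check, and the range of $n$ is an arbitrary choice.

```python
from math import comb, factorial

def add(A, B, s=1):
    """Return A + s*B for operators stored as {(k, n): c} = sum of c * x^k D^n."""
    C = dict(A)
    for key, c in B.items():
        C[key] = C.get(key, 0) + s * c
    return {key: c for key, c in C.items() if c != 0}

def mul(A, B):
    """Operator product, using D^b x^c = sum_j C(b,j) c!/(c-j)! x^(c-j) D^(b-j)."""
    C = {}
    for (a, b), ca in A.items():
        for (c, d), cb in B.items():
            for j in range(min(b, c) + 1):
                w = comb(b, j) * (factorial(c) // factorial(c - j))
                C = add(C, {(a + c - j, b + d - j): ca * cb * w})
    return C

def sigma(A):
    """Apply sigma termwise: sigma(x^k D^n) = (-1)^k D^k x^n."""
    C = {}
    for (k, n), c in A.items():
        C = add(C, mul({(0, k): 1}, {(n, 0): 1}), c * (-1) ** k)
    return C

def power(A, m):
    """m-fold product of A with itself (the identity operator for m = 0)."""
    R = {(0, 0): 1}
    for _ in range(m):
        R = mul(R, A)
    return R

def leading(A):
    """Leading term as in Definition 5: highest order first, then highest degree."""
    k, n = max(A, key=lambda kn: (kn[1], kn[0]))
    return (k, n), A[(k, n)]

A1 = {(0, 2): 1, (2, 0): 1}        # D^2 + x^2
A2 = {(2, 2): 1, (1, 1): 2}        # x^2 D^2 + 2xD

for n in range(4):
    for k in range(n + 1):
        H = mul(power(A1, n - k), power(A2, k))
        assert sigma(H) == H                                   # invariant
        assert leading(H) == ((2 * k, 2 * n), 1)               # Lemma 7
        if k < n:
            C = add(mul(power(A1, n - k), power(A2, k + 1)),
                    mul(power(A2, k + 1), power(A1, n - k)), -1)
            assert sigma(C) == C                               # invariant
            assert leading(C) == ((2 * k + 1, 2 * n + 1),
                                  4 * (n - k) * (k + 1))       # Lemma 8
```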


THEOREM 5. A necessary and sufficient condition that there exist a linear
differential invariant with respect to $\sigma$ with leading term $x^\delta D^\epsilon$ is that either

$\delta \equiv \epsilon \equiv 0 \bmod 2$, $\delta \le \epsilon$,
or
$\delta \equiv \epsilon \equiv 1 \bmod 2$, $\delta < \epsilon$.

Lemmas 7 and 8 exhibit linear differential invariants for each leading
term indicated in Theorem 5. We proceed to show that no linear differential
invariant can exist with leading term not proportional to one of those indicated
in Theorem 5. By Theorem 4 we must have $\delta \equiv \epsilon \bmod 2$. By Lemma 6 we
must have $\delta \le \epsilon$. We have left the one case $\delta = \epsilon = 2h + 1$ and we consider
that now. If a linear differential operator $y$ had for its leading term
$x^{2h+1} D^{2h+1}$, then $\sigma y$ would have for its leading term $(-x^{2h+1} D^{2h+1})$, and $y$ could
not be invariant. This concludes the proof of Theorem 5. We proceed to
the main result of this section.


THEOREM 6. Any linear differential invariant with respect to $\sigma$ is a
linear combination of linear differential invariants of type H.




Stated in another way, we show that the invariants of type H form a
linear (infinite) basis for the algebra whose elements are the invariants with
respect to $\sigma$.

Let $y$ be any linear differential invariant with leading term $a_{\delta\epsilon} x^\delta D^\epsilon$, where,
of course, $\delta$ and $\epsilon$ are subject to the restrictions of Theorem 5. By Lemmas 7
and 8 there exists a linear differential invariant of type H with leading term
$x^\delta D^\epsilon$. Hence we see that there exists a linear combination of $y$ and a linear
differential invariant of type H (with coefficient of $y$ not zero) which is invariant
under $\sigma$ and is such that its leading term is either (a) of lower order
than the leading term of $y$, or (b) of the same order and of lower degree than
the leading term of $y$. Repetition of this argument shows that there exists
an identically vanishing linear combination (with coefficient of $y$ not zero)
of $y$ and linear differential invariants of type H. Thus Theorem 6 is
established.

Next we note that, since no two of the linear differential invariants of
type H have proportional leading terms, it follows that the linear differential
invariants of type H are linearly independent.

The preceding work, particularly Theorem 6, shows that $A_1$ and $A_2$ form
a fundamental system of invariant differential operators in the sense of


THEOREM 7. Any linear differential operator invariant with respect to $\sigma$
may be expressed as a polynomial in $A_1$ and $A_2$.


Of course, Theorem 3 has already stated that any polynomial in $A_1$ and
$A_2$ is invariant under $\sigma$. Since $A_1$ is not commutative with $A_2$, the word
polynomial is used here in the sense of linear combinations of operators of the
type exhibited in Theorem 3.


5. Second classification of invariants.

LEMMA 9. A necessary and sufficient condition for

$I_{kn} = x^k D^n + \sigma(x^k D^n)$; $0 \le k, n$,

to be a linear differential invariant with respect to $\sigma$ is that $k \equiv n \bmod 2$.


Noting that

$\sigma I_{kn} = \sigma(x^k D^n) + \sigma^2(x^k D^n)$,

and recalling Lemma 4 we see that Lemma 9 follows at once.
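
Two small cases make the parity condition of Lemma 9 concrete. The computation below is an added illustration, assuming $\sigma[x^k D^n] = (-1)^k D^k x^n$ and the commutation rule $Dx = xD + 1$; the particular index pairs are chosen only as examples.

```latex
% Case k = 1, n = 3, where k \equiv n \bmod 2:
\[
I_{13} = xD^3 + \sigma(xD^3) = xD^3 - Dx^3 = xD^3 - x^3 D - 3x^2 ,
\]
\[
\sigma I_{13} = -Dx^3 + D^3 x - 3D^2
             = (-x^3 D - 3x^2) + (xD^3 + 3D^2) - 3D^2 = I_{13} .
\]
% Case k = 0, n = 1, where k \not\equiv n \bmod 2:
\[
I_{01} = D + \sigma(D) = D + x, \qquad \sigma I_{01} = x - D \neq I_{01} .
\]
```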


DEFINITION 7. By linear differential invariants of type J we mean the
set of invariant operators

$I_{kn}$; $k \equiv n \equiv 0 \bmod 2$, $0 \le k \le n$,
and
$I_{kn}$; $k \equiv n \equiv 1 \bmod 2$, $1 \le k < n$.

Note that in the set of linear differential invariants of type J whenever
$k \ne n$ the leading term of $I_{kn}$ is $x^k D^n$; if $k = n$, then $k$ and $n$ are even and
the leading term of $I_{nn}$ is $2x^n D^n$. Hence it is evident that the linear differential
invariants of type J are linearly independent.

Since all leading terms permitted by Theorem 5 are included in type J,
we may follow the line of reasoning used to prove Theorem 6 and thus
demonstrate


THEOREM 8. Any linear differential operator invariant with respect to $\sigma$
may be expressed linearly in terms of linear differential invariants of type J.

See also the remark directly below Theorem 6.

6. A classification of $t$-variants. We shall briefly indicate a classification
of $t$-variants similar to the above second classification of invariants. This
done, we may consider $t$-variants completely specified and may proceed to two
representation theorems with the aid of which we solve linear operational
equations in $\sigma$.

From Lemma 4 of Section 3 we get


LEMMA 10. A necessary and sufficient condition that a linear differential
operator be pseudo-invariant with respect to $\sigma^2$ is that, for each term $a_{kn} x^k D^n$
of the operator, $k \equiv n + 1 \bmod 2$.


Let $i = \sqrt{-1}$. If $t = i$ or if $t = i^3$, then $t^2 = -1$ and any corresponding
linear differential $t$-variant with respect to $\sigma$ is pseudo-invariant with
respect to $\sigma^2$. If $t^2 = 1$, then we have actual invariance with respect to $\sigma^2$.
Hence Lemmas 5 and 10 lead to


THEOREM 9. For each term $a_{kn} x^k D^n$ of a linear differential $t$-variant with
respect to $\sigma$ it is true that $k \equiv n + \tfrac12(1 - t^2) \bmod 2$.


LEMMA 11. A necessary and sufficient condition that

$I^{(t)}_{kn} = x^k D^n + t^3 \sigma(x^k D^n)$, $t^4 = 1$,

be a $t$-variant with respect to $\sigma$ is that $k \equiv n + \tfrac12(1 - t^2) \bmod 2$.


Since

$\sigma I^{(t)}_{kn} = \sigma(x^k D^n) + t^3 \sigma^2(x^k D^n)$

and

$t I^{(t)}_{kn} = t x^k D^n + \sigma(x^k D^n)$,

a necessary and sufficient condition for the equality of $\sigma I^{(t)}_{kn}$ and $t I^{(t)}_{kn}$ is that
$\sigma^2(x^k D^n) = t^2 x^k D^n$. By Lemma 4 this is equivalent to $t^2 = (-1)^{k+n}$ or to
$k \equiv n + \tfrac12(1 - t^2) \bmod 2$.
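
The parity criterion of Lemma 11 can be tested for all four values of $t$ at once. The sketch assumes the dictionary storage `{(k, n): c}` for the sum of terms $c\,x^k D^n$, the reading $\sigma[x^k D^n] = (-1)^k D^k x^n$, and the form $I^{(t)}_{kn} = x^k D^n + t^3 \sigma(x^k D^n)$; Python complex numbers stand in for the imaginary unit, and the names `I_t` and `scale` are assumptions of the sketch.

```python
from math import comb, factorial

def add(A, B, s=1):
    """Return A + s*B for operators stored as {(k, n): c} = sum of c * x^k D^n."""
    C = dict(A)
    for key, c in B.items():
        C[key] = C.get(key, 0) + s * c
    return {key: c for key, c in C.items() if c != 0}

def mul(A, B):
    """Operator product, using D^b x^c = sum_j C(b,j) c!/(c-j)! x^(c-j) D^(b-j)."""
    C = {}
    for (a, b), ca in A.items():
        for (c, d), cb in B.items():
            for j in range(min(b, c) + 1):
                w = comb(b, j) * (factorial(c) // factorial(c - j))
                C = add(C, {(a + c - j, b + d - j): ca * cb * w})
    return C

def sigma(A):
    """Apply sigma termwise: sigma(x^k D^n) = (-1)^k D^k x^n."""
    C = {}
    for (k, n), c in A.items():
        C = add(C, mul({(0, k): 1}, {(n, 0): 1}), c * (-1) ** k)
    return C

def scale(A, c):
    """Multiply every coefficient of A by the nonzero constant c."""
    return {key: c * v for key, v in A.items()}

def I_t(k, n, t):
    """x^k D^n + t^3 sigma(x^k D^n), with t a fourth root of unity."""
    v = {(k, n): 1}
    return add(v, sigma(v), t * t * t)

for t in (1, -1, 1j, -1j):
    shift = 0 if t * t == 1 else 1            # the quantity (1 - t^2) / 2
    for k in range(4):
        for n in range(4):
            y = I_t(k, n, t)
            is_variant = sigma(y) == scale(y, t)
            assert is_variant == ((k - n - shift) % 2 == 0)
```

The coefficients that occur are small Gaussian integers, so the floating-point complex arithmetic used here is exact.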
Considerations similar to those used in the proof of Theorem 5 yield a
proof (omitted here) of


THEOREM 10. A necessary and sufficient condition that there exist a
linear differential $t$-variant with respect to $\sigma$ with leading term $x^\delta D^\epsilon$ is that


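either

$\delta \equiv \epsilon + \tfrac12(1 - t^2) \equiv 0 \bmod 2$, $\delta \le \epsilon - \tfrac14(1 - t)(1 + t^2)$,
or

$\delta \equiv \epsilon + \tfrac12(1 - t^2) \equiv 1 \bmod 2$, $\delta \le \epsilon - \tfrac14(1 + t)(1 + t^2)$.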


DEFINITION 8. By linear differential $t$-variants of type J we mean the
set of $t$-variant operators


I); 
and 


It can be seen that the linear differential $t$-variants of type J are linearly
independent. Reasoning parallel to that used to prove Theorem 6 will
demonstrate


THEOREM 11. Any linear differential $t$-variant with respect to $\sigma$ may be
expressed linearly in terms of linear differential $t$-variants of type J.


See also the remark directly below Theorem 6. 


7. Representation theorems. We shall prove the following two theorems 
on the representation of linear differential operators of type P. 


THEOREM 12. Any linear differential operator of type P may be repre- 
sented in one, and only one, way in the form 


(17) $I + P + Q + W$


where $I$ is an invariant, $P$ a pseudo-invariant, $Q$ an $i$-variant, and $W$ an
$i^3$-variant.


THEOREM 13. Any linear differential operator of type P may be represented
in one, and only one, way in the form

(18) $I_1 + P_1 + x(I_2 + P_2) + D(I_3 + P_3)$


where $I_1, I_2, I_3$ are invariants and $P_1, P_2, P_3$ are pseudo-invariants.




In order to picture more clearly the relation between Theorems 12 and
13, let us consider the operator $2x^2 D$. We may write


$2x^2 D = (-ixD^2 + x^2 D - 2iD) + (ixD^2 + x^2 D + 2iD)$,


where $Q = -ixD^2 + x^2 D - 2iD$ and $W = ixD^2 + x^2 D + 2iD$ are respectively
an $i$-variant and an $i^3$-variant. Here, though the operator $2x^2 D$ is real,
the representation (17) introduces the imaginary unit. We may, on the other
hand, write


$2x^2 D = x[(-1) + (2xD + 1)]$,


where $I_2 = -1$ and $P_2 = 2xD + 1$ are respectively an invariant and a
pseudo-invariant. Hence, using (18) the representation of $2x^2 D$ is "real."
The representations (17) and (18) play roles corresponding to the two solutions
$F = a_1 e^{ix} + a_2 e^{-ix}$ and $F = c_1 \cos x + c_2 \sin x$ of the differential equation
$(D^2 + 1)F = 0$.

The example in Section 10 illustrates the fact that (18) may on occasion
have considerable advantage over the apparently simpler and more natural
representation (17).

The representation (17) is essentially a result of the fact that $\sigma$ satisfies
the operational equation $\sigma^4 = E$, the identity.

Proof of Theorem 12. First we give an explicit expression for any term
$x^k D^n$ of a linear differential operator in the manner desired. Let $k, n$ be
non-negative integers. Then, using the notation of Lemma 11, we have


Because of the linearity of the operators uniqueness of (17) will follow
if we show that


(20) $I + P + Q + W = 0$,


with the notation of Theorem 12, implies $I = P = Q = W = 0$.
If we operate on (20) with $\sigma^2$ we find


(21) $I + P - Q - W = 0$.


From (20) and (21) we have $I + P = 0$. Operating on this with $\sigma$, we get
$I - P = 0$. Hence $I = P = 0$. But (20) and (21) also lead to $Q + W = 0$,
from which it follows that $iQ - iW = 0$. Hence $Q = W = 0$ and the proof
of Theorem 12 is complete.
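
One way to compute the four parts of the representation (17) in practice is to average over the powers of $\sigma$: since $\sigma^4$ is the identity, the operator $\tfrac14(y + t^3\sigma y + t^2\sigma^2 y + t\sigma^3 y)$ is a $t$-variant for each fourth root of unity $t$, and the four such operators add up to $y$. This averaging formula is a standard device supplied here as an assumption; it is not necessarily the explicit expression used in the paper. The sketch keeps the dictionary storage `{(k, n): c}` and the reading $\sigma[x^k D^n] = (-1)^k D^k x^n$, and `component` is a hypothetical name.

```python
from math import comb, factorial

def add(A, B, s=1):
    """Return A + s*B for operators stored as {(k, n): c} = sum of c * x^k D^n."""
    C = dict(A)
    for key, c in B.items():
        C[key] = C.get(key, 0) + s * c
    return {key: c for key, c in C.items() if c != 0}

def mul(A, B):
    """Operator product, using D^b x^c = sum_j C(b,j) c!/(c-j)! x^(c-j) D^(b-j)."""
    C = {}
    for (a, b), ca in A.items():
        for (c, d), cb in B.items():
            for j in range(min(b, c) + 1):
                w = comb(b, j) * (factorial(c) // factorial(c - j))
                C = add(C, {(a + c - j, b + d - j): ca * cb * w})
    return C

def sigma(A):
    """Apply sigma termwise: sigma(x^k D^n) = (-1)^k D^k x^n."""
    C = {}
    for (k, n), c in A.items():
        C = add(C, mul({(0, k): 1}, {(n, 0): 1}), c * (-1) ** k)
    return C

def component(y, t):
    """The t-variant part of y: (y + t^3 sigma y + t^2 sigma^2 y + t sigma^3 y) / 4."""
    out, s, c = {}, y, 1
    for _ in range(4):
        out = add(out, s, c)
        s, c = sigma(s), c * t * t * t
    return {key: v / 4 for key, v in out.items()}

y = {(2, 1): 2}                                   # the operator 2 x^2 D of the text
I, P = component(y, 1), component(y, -1)
Q, W = component(y, 1j), component(y, -1j)

assert I == {} and P == {}
assert Q == {(1, 2): -1j, (2, 1): 1, (0, 1): -2j}     # -i x D^2 + x^2 D - 2i D
assert W == {(1, 2): 1j, (2, 1): 1, (0, 1): 2j}       #  i x D^2 + x^2 D + 2i D
assert add(Q, W) == y
assert sigma(Q) == {key: 1j * v for key, v in Q.items()}
```

The values of `Q` and `W` agree with the decomposition of $2x^2 D$ given before the proof.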




Proof of Theorem 13. In order to show the existence of the representation
(18) we write the identity, for $k$ and $n$ non-negative integers,


+ (— + 1] 


+ sgn k) (1 (— + I, 


in which sgn $k$ is the usual signum function with argument $k$.


In order to prove the uniqueness of the representation (18), we may
follow in part the method used in the proof of Theorem 12. Assume


(23) $I_1 + P_1 + x(I_2 + P_2) + D(I_3 + P_3) = 0$,

with the notation as in Theorem 13. Operating on (23) with $\sigma^2$ we find

(24) $I_1 + P_1 - x(I_2 + P_2) - D(I_3 + P_3) = 0$.


From these two equations $I_1 + P_1 = 0$ and hence $I_1 = P_1 = 0$ follow. We
are left with

(25) $x(I_2 + P_2) + D(I_3 + P_3) = 0$.


Here the method of proof digresses from that above. We recall that the degree
of a linear differential invariant equals its order and that the same is true of
a pseudo-invariant (Lemma 2). Hence the degree of $(I_2 + P_2)$ equals its
order, say $n_2$. Also the degree of $(I_3 + P_3)$ equals its order, say $n_3$. Since
the degrees and the orders of the two terms of (25) must be respectively equal,
we have $n_2 + 1 = n_3$ and $n_2 = n_3 + 1$. But no pair of values $n_2$ and $n_3$ can
satisfy these relations. Thus we have $x(I_2 + P_2) = 0$ and $D(I_3 + P_3) = 0$.
It follows at once that $I_2 = P_2 = I_3 = P_3 = 0$.

Suppose an operator $y$ is represented in each of the forms (17) and (18).
It is then easy to obtain $I, P, Q$, and $W$ in terms of $I_1, I_2, I_3, P_1, P_2, P_3$.
We find $I = I_1$, $P = P_1$,


(26) $Q = \tfrac12 (D - ix)(I_3 + iI_2) + \tfrac12 (D + ix)(P_3 - iP_2)$,


and


(27) $W = \tfrac12 (D + ix)(I_3 - iI_2) + \tfrac12 (D - ix)(P_3 + iP_2)$.


An examination of (20) and (22) leads us to believe that the determination
of $I_2, I_3, P_2, P_3$ in terms of $Q$ and $W$ is a term by term affair not to be written
in such simple forms as (26) and (27).
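
As a check of the relation between the two representations, the $i$-variant part of $2x^2 D$ can be recovered from its "real" representation. The computation is an added illustration; it takes (26) in the form $Q = \tfrac12 (D - ix)(I_3 + iI_2) + \tfrac12 (D + ix)(P_3 - iP_2)$ and uses only $Dx = xD + 1$.

```latex
% Data: 2x^2 D = x[(-1) + (2xD + 1)], so I_2 = -1, P_2 = 2xD + 1, I_3 = P_3 = 0.
\[
Q = \tfrac12 (D - ix)(-i) + \tfrac12 (D + ix)\bigl(-i\,(2xD + 1)\bigr) .
\]
% Expand the second product with Dx = xD + 1:
\[
(D + ix)(2xD + 1) = 2xD^2 + 3D + 2i x^2 D + ix ,
\]
\[
Q = \Bigl(-\tfrac{i}{2} D - \tfrac12 x\Bigr)
  + \Bigl(-i x D^2 - \tfrac{3i}{2} D + x^2 D + \tfrac12 x\Bigr)
  = -i x D^2 + x^2 D - 2iD .
\]
```

This is the operator $Q$ found directly for $2x^2 D$ in the discussion following Theorem 13.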




8. Linear operational equations in $\sigma$: constant coefficients. Next we
consider linear operational equations in $\sigma$ in analogy to linear differential
equations in $D$. The unknown to be determined is now a linear differential
operator. Since $\sigma^4 y = y$ for $y$ any linear differential operator, we consider
only operational equations of "order" $\le 3$ in $\sigma$.

First we treat the homogeneous equation


(28) $a_3 \sigma^3 y + a_2 \sigma^2 y + a_1 \sigma y + a_0 y = 0$,


where $a_3, a_2, a_1, a_0$ are constants. We may operate on (28) repeatedly with $\sigma$ and obtain


$a_2 \sigma^3 y + a_1 \sigma^2 y + a_0 \sigma y + a_3 y = 0$,
$a_1 \sigma^3 y + a_0 \sigma^2 y + a_3 \sigma y + a_2 y = 0$,
$a_0 \sigma^3 y + a_3 \sigma^2 y + a_2 \sigma y + a_1 y = 0$.


For (28) to have a solution other than $y = 0$ it is necessary that the determinant
of the coefficients of $y$, $\sigma y$, $\sigma^2 y$, $\sigma^3 y$ in the above equations vanish.
This leads at once to


THEOREM 14. A necessary condition that there exist a non-vanishing
linear differential operator $y$ which satisfies


(28) $a_3 \sigma^3 y + a_2 \sigma^2 y + a_1 \sigma y + a_0 y = 0$,


where $a_3, a_2, a_1, a_0$ are constants, is that


(29) $(a_3 + a_2 + a_1 + a_0)(a_3 - a_2 + a_1 - a_0)[(a_3 - a_1)^2 + (a_2 - a_0)^2] = 0$.
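
The factorization behind (29) can be confirmed numerically. The sketch below builds the matrix of coefficients of $\sigma^3 y$, $\sigma^2 y$, $\sigma y$, $y$ in (28) and in the three equations obtained from it by applying $\sigma$, $\sigma^2$, $\sigma^3$ (using $\sigma^4 y = y$), and compares its determinant with the product in (29). The function names and the ordering of rows and columns are assumptions of the sketch; with this ordering the determinant equals the negative of the product, so the two vanish together.

```python
from itertools import permutations

def det(m):
    """Determinant by the permutation expansion (adequate for a 4x4 matrix)."""
    n, total = len(m), 0
    for p in permutations(range(n)):
        sign = 1
        for i in range(n):
            for j in range(i + 1, n):
                if p[i] > p[j]:
                    sign = -sign
        term = sign
        for i in range(n):
            term *= m[i][p[i]]
        total += term
    return total

def system_matrix(a3, a2, a1, a0):
    """Rows: (28) and its images under sigma, sigma^2, sigma^3.
    Columns: coefficients of sigma^3 y, sigma^2 y, sigma y, y."""
    row = [a3, a2, a1, a0]
    return [row[j:] + row[:j] for j in range(4)]

def factored(a3, a2, a1, a0):
    """Left member of (29)."""
    return ((a3 + a2 + a1 + a0) * (a3 - a2 + a1 - a0)
            * ((a3 - a1) ** 2 + (a2 - a0) ** 2))

for coeffs in [(1, 2, 3, 4), (0, 0, 1, 2), (2, -1, 5, 7), (1, 0, 0, 1), (3, 1, 3, 1)]:
    assert det(system_matrix(*coeffs)) == -factored(*coeffs)
```

The tuples `(1, 0, 0, 1)` and `(3, 1, 3, 1)` are cases in which (29) holds and the determinant vanishes.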
If we now put $y = I + P + Q + W$, Equation (28) yields


(30)
$(a_3 + a_2 + a_1 + a_0) I = A_1 I = 0$,
$(a_3 - a_2 + a_1 - a_0) P = A_2 P = 0$,
$[(a_3 - a_1) - i(a_2 - a_0)] Q = A_3 Q = 0$,
$[(a_3 - a_1) + i(a_2 - a_0)] W = A_4 W = 0$,


with the aid of Theorem 12. By means of Equations (30) we are able to
determine the general solution of (28). For example, if $A_1 = A_3 = A_4 = 0$
and $A_2 \ne 0$, then the general solution of (28) is $y = I + Q + W$, where $I$ is
any invariant, $Q$ any $i$-variant and $W$ is any $i^3$-variant.

We turn now to the non-homogeneous case.


THEOREM 15. Let $a_3, a_2, a_1, a_0$ be constants such that $A_1 A_2 A_3 A_4 \ne 0$,
where the $A$'s are as defined in (30). Let $I, P, Q, W$ be as defined in Theorem
12. Then there exists one, and only one, linear differential operator which
satisfies the equation


(31) $a_3 \sigma^3 y + a_2 \sigma^2 y + a_1 \sigma y + a_0 y = I + P + Q + W$,

namely

(32) $y = \frac{I}{A_1} - \frac{P}{A_2} + \frac{iQ}{A_3} - \frac{iW}{A_4}$.


That (32) is a solution of (31) may be seen by direct substitution. If
there were two distinct solutions of (31), the non-vanishing difference of those
solutions would satisfy (28), the homogeneous equation. In view of the inequality
of Theorem 15 and the necessary condition in Theorem 14, this is
impossible.

Equation (32) is readily altered to fit the case where the homogeneous
equation also has a solution. Let us suppose, for example, that in (31) we
find $A_2 = A_4 = 0$, $A_1 A_3 \neq 0$. Then there exists no solution of (31) unless
$P = 0$ and $W = 0$. If these conditions are satisfied, the general solution of
(31) is


Wi, 


= — Py+ 
where $P_1$ is any pseudo-invariant, $W_1$ any $i^3$-variant and the other symbols are
as in Theorem 15. There is here a noticeable resemblance to the general
solution of a non-homogeneous ordinary linear differential equation.
If in Theorem 15 we use the representation of Theorem 13, instead of
that of Theorem 12, we need only to replace (31) by


(31’ ) al. + loo” + ay=T, -++- Py + + -+ + DU; + P,), 
and (32) by 


9. Linear differential equations invariant under $\sigma$. We have incidentally
solved the problem of determining what linear differential equations
are invariant under $\sigma$. Theorem 14 and the remarks following it yield at once


THEOREM 16. Let $y$ be a linear differential operator of type P and let
$F$ be an undetermined function of $x$. Then a necessary and sufficient condition
that $yF = 0$ be invariant under $\sigma$ is that $y$ be a $t$-variant with respect to $\sigma$.




This is an interpretation of the fact that for $\sigma y = cy$ to have a solution
for constant $c$ it is necessary and sufficient that $c^4 = 1$.

By means of the representation (18) of Theorem 13 we are able to state 
this result in another form sometimes more useful. 


THEOREM 17. Let y be a linear differential operator of type P and let 
F be an undetermined function of x. Then a necessary and sufficient condition 
that yF = 0 be invariant under o is that y be expressible in one of the four 
forms (33)-(36) ; 


(33) $y = I_1,$
(34) $y = P_1,$
(35) $y = x(I_2 + P_2) + iD(I_2 - P_2),$
(36) $y = x(I_2 + P_2) - iD(I_2 - P_2),$


where $I_1$, $I_2$ are invariants, $P_1$, $P_2$ are pseudo-invariants, and $i = \sqrt{-1}$.

10. Linear operational equations in $\sigma$: variable coefficients. We may
generalize equation (31) to the case where the coefficients are themselves
linear differential operators. We consider


(37) $a_3(\sigma^3 y)b_3 + a_2(\sigma^2 y)b_2 + a_1(\sigma y)b_1 + a_0 y b_0 = A,$


where the $a_i$, $b_i$; $i = 0, 1, 2, 3$, and $A$ are known linear differential operators
and $y$ is to be determined. We shall illustrate our two methods of attack on
(37) by means of a numerical example. Consider


(38) $x\sigma y - Dy = 0.$


If we use the representation $y = I + P + Q + W$ as indicated in
Theorem 12, we find that


(39) P)—D(Q+ W) =0. 


Operating on (39) with $\sigma$, $\sigma^2$, $\sigma^3$, and combining the resulting equations, we
readily obtain $I = 0$, $P = 0$, and


(40) $(D - ix)Q + (D + ix)W = 0.$


Further, if we substitute $y = Q + W$ into (38) we get (40). Hence, the
general solution of (38) is $y = Q + W$ where $Q$ and $W$ are respectively any
$i$-variant and any $i^3$-variant subject to the restriction (40).

Let us now attack (38) with the representation made available by Theorem
13. In this case the solution appears in a more satisfactory form. Let
$y = I_1 + P_1 + x(I_2 + P_2) + D(I_3 + P_3)$. Then from (38) we get


(41) D)I, — D)P, 
— (22D + 1)1,— P, + — Ps) + Ps) = 0. 
Using o on (41) we arrive at 


(42) D)I,— (D—z)P, 
+ + D? (1, + — P3) = 0. 


Equations (41) and (42) combine to yield $2D(I_1 + P_1) = 0$, hence $I_1 = P_1 = 0$.
We return to (41) which has become


(43) (22D + Ps) + D?(Iz + =0. 


Substituting for P, from (43) into the assumed expression for y we get 


y = — — — D — I, — (xD? — D 4+ 2°) 
Since J, and J; may be any invariants and P; any pseudo-invariant, we shall 
write 
(44) y= (cD? —D—2')I, + (aD? — D+ 2*)P,;. 
Then 
toy = 1, + Da? + D*)I, Da? — — D*)P; 
= 2«(¢D* + 2D)I,+ — D — + «(D* + + 
and 
Dy = + 22D)I,+ — D — 32’) + + + 327) Ps. 


Thus we have the result: the general solution of (38) is (44), where /, and 
I, are any invariants and P; is any pseudo-invariant with respect to o. In 
this case the representation in Theorem 13 is seen to have a considerable 
advantage over that in Theorem 12. 


UNIVERSITY OF MICHIGAN. 




ON THE MINIMUM NUMBER OF POLYGONS IN AN 
IRREDUCIBLE MAP.* 


By C. E. WINN. 


In a recent paper Franklin¹ proved the number of polygons² in an
irreducible map M to be at least 32. It is proposed here to shew with the help
of certain new reductions that the number is at least 36.

Our main object is to set an upper limit on the number of pentagons
touching a given polygon of M. When the contacts are consecutive, we use
the fact that


A. A polygon of 5, 6, 7 or n > 7 sides is reducible when in contact
respectively with 3, 3, 4 or n - 2 adjacent pentagons.³


When a pentagon has separate contacts with other pentagons, we note its
reducibility⁴ if it touches the chain 5665. And, combining with A, we find


B. A pentagon is reducible when in contact with 4 minor polygons of 
which the extremes are pentagons. 


Using the fact that 


C. A hexagon in contact with the chain 5565 or 55665 is reducible, 


we shall prove a result analogous to B, namely


D. A hexagon is reducible when in contact with 5 minor polygons of 
which the extremes are pentagons. 


This still allows the possibility of a hexagon of M touching two separate 
pairs of pentagons and two major polygons. But in this case we observe that 


E. A hexagon touching two separate pairs of pentagons is reducible when
both pairs are in triad with another pentagon.⁵


* Received June 24, 1938. 

¹ “Note on the four color problem,” Journal of Mathematics and Physics, vol. 16
(1938), p. 172 (published at Mass. Inst. of Technology).

² In an irreducible map every region is either a minor polygon of 5 or 6 sides, or a
major polygon of more than 6 sides.

³ The only recent case is the third, given by the author, “On certain reductions in
the four color problem,” Journal of Mathematics and Physics, vol. 16 (1938), p. 159.

⁴ C. E. Winn, “A case of coloration in the four color problem,” American Journal
of Mathematics, vol. 59 (1937), p. 515.

⁵ Loc. cit.*. Unfortunately the claim in footnote 17 turns out to be unfounded.




As regards separate contacts with a heptagon, it is known that 


F. A heptagon touching 4 pentagons and 3 hexagons in any order is
reducible.⁵
We supplement this result by proving that 
G. A heptagon in contact with the chain 55655 is reducible. 


The details of the new reductions appear at the end of the paper, as
well as those of a few configurations not employed here. The latter are as
follows:


A pair of pentagons in triad with a heptagon and touching no other 
major polygon. 
A pair of hexagons in contact with 55655 or 556655. 


The next configuration is obtained by introducing into the ring of Errera⁶
the triad 575 occurring in a recent reduction of Franklin,¹ an odd number of
hexagons being allowed.


Any ring formed of pairs 57 and, optionally, pairs 55 and hexagons in
any order, the pairs 57 being oriented in one direction and each in triad with
a pentagon that touches no other polygon of the ring.


As in Errera's case an isthmus in the reduced figure implies a Birkhoff
ring in the original map when the ring encloses a single polygon, a pair or a
triad; otherwise it may invalidate the result. Simple instances of unrestricted
reducibility are those of 5(5)7666 about a pentagon and 5(5)765(5)76
about a hexagon, the digit in brackets denoting the pentagonal ‘cap.’ The
final configuration is a modification of this type, namely

The ring 5(5)7665 about a pentagon.


If $a_5$ be the number of pentagons $A_5$ in M, and $j_{5n}$ be the number of their
contacts with higher polygons $A_n$, we shall have


(1) $j_{56} + 2\sum_{n=7} j_{5n} \ge 4a_5.$


For the contribution of $A_5$ to the left member (which cannot exceed 10)
is at least 4 when it touches one or no other pentagon. Also in view of the
reductions A, B and the fact that an irreducible pentagon touches at least
one major polygon,¹ the possible contacts of $A_5$ with more than one pentagon
are 55N5N, 55nnN, 55nNn and 5nN5n, where $n \ge 6$ and $N \ge 7$, as hereafter.
So in general the contribution is seen to be at least 4.


⁶ “Une contribution au problème des quatre couleurs,” Bulletin de la Société
mathématique de France, vol. 53 (1925), p. 42.

Let us now denote by $A_5^{(r)}$ an $A_5$ contributing $4 + r$ to the left of (1).
Then $a_5^{(r)}$ being the number of such pentagons, we get


(2) $j_{56} + 2\sum_{n=7} j_{5n} = 4a_5 + \sum_{r=1}^{6} r\,a_5^{(r)}.$


Further, let $A_n^{(r)}$ be an $A_n$ touching $r$ pentagons. Their number being
$a_n^{(r)}$, it follows from the last case of A that


(3) $j_{5n} = \sum_{r=1}^{n-2} r\,a_n^{(r)}.$


The combination of (2) and (3) leads to


$\sum_{r=1}^{4} r\,a_6^{(r)} + 2\sum_{n=7}\sum_{r=1}^{n-2} r\,a_n^{(r)} = 4a_5 + \sum_{r=1}^{6} r\,a_5^{(r)},$


whence, seeing that $a_n = \sum_{r=1}^{n-2} a_n^{(r)}$, we get


$2\sum_{n=6}(3n-17)a_n + a_6^{(3)} + 2a_6^{(4)} \ge 4a_5 + \sum_{r=1}^{6} r\,a_5^{(r)} + 2\sum_{n=7}\sum_{r=1}^{n-2}(3n-r-17)a_n^{(r)}.$


Consequently, if we shew


(4) $a_6^{(3)} + 2a_6^{(4)} + 2a_7^{(5)} \le \sum_{r=1}^{6} r\,a_5^{(r)} + 2\sum_{n=7}\sum_{r=1}^{n-2}(3n-r-17)a_n^{(r)},$


the negative term, given by $n = 7$, $r = 5$, being omitted from the double sum,
it will follow that


$\sum_{n=6}(3n-17)a_n \ge 2a_5.$


Then we shall obtain by Euler's relation, as required,


(5) $a_5 + \sum_{n=6} a_n \ge 3a_5 - 3\sum_{n=7}(n-6)a_n = 36.$
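
The last step can be illustrated by a small exhaustive search. This is only a sketch under stated assumptions: Euler's relation is taken in the form $a_5 = 12 + \sum_{n \ge 7}(n-6)a_n$ (every vertex of degree 3, no polygon of fewer than 5 sides), only polygons of at most 8 sides are considered, and the function name is hypothetical. Among all such counts satisfying $\sum_{n \ge 6}(3n-17)a_n \ge 2a_5$, the smallest total number of polygons found is 36.

```python
from itertools import product


def minimum_total(max_major=3, max_hexagons=40):
    """Smallest a_5 + a_6 + a_7 + a_8 compatible with Euler's relation and
    the inequality sum (3n - 17) a_n >= 2 a_5, over a small search box."""
    best = None
    ranges = (range(max_hexagons + 1), range(max_major + 1), range(max_major + 1))
    for a6, a7, a8 in product(*ranges):
        counts = {6: a6, 7: a7, 8: a8}
        # Euler's relation for a trivalent map without polygons of fewer than 5 sides.
        a5 = 12 + sum((n - 6) * c for n, c in counts.items())
        if sum((3 * n - 17) * c for n, c in counts.items()) < 2 * a5:
            continue
        total = a5 + sum(counts.values())
        best = total if best is None else min(best, total)
    return best
```

The search is no substitute for the argument in the text; it only confirms the arithmetic of (5) on a finite range.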


It may be remarked incidentally that, if no two pentagons of an irreducible
map are adjacent, then at least 18 pentagons touch 3 or 4 hexagons.⁷

In fact, denoting the number of pentagons required by $a_5'$ and $a_5''$
respectively, we have, since the contacts $j_{5n}$ are separate,


$3\sum_{n=7}(n-6)a_n \ge \sum_{n=7} j_{5n} \ge 3a_5 - a_5' - 2a_5''$
$= 36 + 3\sum_{n=7}(n-6)a_n - a_5' - 2a_5'',$


whence
$a_5' + 2a_5'' \ge 36.$


⁷ Cp. Reynolds, “On the problem of coloring maps in four colors,” Annals of
Mathematics, vol. 28 (1926), p. 1.


To establish (4), we shall set against $A_6^{(3)}$, $A_6^{(4)}$ respectively one or two
polygons $A_5^{(r)}$, $A_n^{(r)}$ adjacent to them as compensating elements which contribute
to the right-hand side; and against $A_7^{(5)}$ one or two such elements
$A_5^{(r)}$. It will then be necessary to verify that the number of sources of a given
element, after reckoning twice the source $A_7^{(5)}$ yielding a single element, is
at most equal to the corresponding coefficient on the right of (4).

From C and D we deduce that a hexagon of M that makes separate
contacts with 3 pentagons must touch at least two major polygons. Thus
$A_6^{(3)}$ is bounded by either 5n5N5N, 55N5Nn or 5N5nN5. In the former two
cases we take as our element the last pentagon a adjacent to $A_6^{(3)}$, which is
bounded by 6NmmN, where $m \ge 5$, as hereafter.

In the last case let bcde be the last four polygons about $A_6^{(3)}$, and let
f be the outside polygon touching de. Then, if c > 6, we choose the element
b(6NmmN). If c = 6 and f = 5 or N, we choose e, which, on account of A,
is bounded by 65N5N or 65mNN respectively. Finally, if c = f = 6, our
element is d(66 · · · 65). If d is an $A_7^{(r)}$, we infer from F that $r \le 3$, so that
$A_7^{(4)}$ does not appear as an element.

The contacts of $A_6^{(4)}$ are 55N55N in view of A and C. Moreover, we
conclude from E that one of these pairs of pentagons g, g' is not in triad
with a third pentagon nor, by A, with another hexagon, when g, g' are in
chain with a third pentagon. We here select two elements, namely g, g'(65NmN)
or (656nN).

On account of A and G the ring round $A_7^{(5)}$ is 555N55N. If the fourth
(or last) polygon is also an $A_7^{(5)}$, we take the two pentagons h, h' touching
both $A_7^{(5)}$'s. These are both $A_5^{(2)}$'s, their contacts being 75N57 by A. But,
if there is no adjacent $A_7^{(5)}$, we take the extreme pentagon i of the first three
which, in virtue of B, touches an outside major polygon. We have then a
single element bounded by 75NmN', where N' is not an $A_7^{(5)}$.

In each of the above rings about an element we have placed first its
source. Consequently, a polygon with such contacts may occur as an element
as often as one of its adjacent polygons fits into the first place of the ring
(allowing for a reversal). We have thus to examine the possible occurrences
of $A_5^{(r)}$, where $1 \le r \le 5$, and of $A_n^{(r)}$, where $1 \le r \le n - 2$ or 3, according
as n is greater than or equal to 7.


The incidence of $A_5^{(1)}$ is at
a, b(6N55N); e(655NN); g, g'(6566N).

There is no repetition here, since the two adjacent hexagons touching 
g or g’ cannot come first in any of the four rings. 

The incidence of $A_5^{(2)}$ is at

a, b(6N56N); e, g, g'(656NN); h, h'(75N57); i(75N5N').

We observe that this element occurs twice at most in the first two rings,
which contain only two hexagons; also these rings are distinct from the last
two. Now, by supposition, the last polygon of the third ring is an $A_7^{(5)}$,
whereas the last in the fourth ring is not. Hence an element $A_5^{(2)}$ can only
appear twice in the third ring and once in the last, but not in both. This
yields altogether a maximum of 2 occurrences, counting that at i twice.

The incidence of $A_5^{(3)}$ is at

a, b(6N5NN) or (6N66N); e(65NNN); i(75N6N').

This element occurs only once in the last ring, as the third polygon, being
next to a hexagon, is not an $A_7^{(5)}$. It can then fall but once elsewhere,
namely in the first ring. Thus the maximum amounts to 3, seeing that none
of the first three rings contain more than 3 hexagons.

The incidence of $A_5^{(4)}$ is at

a, b(6NN6N); i(75NNN').

The N between N and N' not being an $A_7^{(5)}$, we infer that this element
can only fall twice in the last ring, and not then in the first. Hence, as the
first ring contains but two hexagons, the maximum here is 4.

Lastly, an $A_5^{(5)}$ is only to be found once, at a, b(6NNNN), while $A_5^{(6)}$
does not occur at all. So altogether the number of pentagons compensating
$A_6^{(3)}$, $A_6^{(4)}$, and $A_7^{(5)}$ is not in excess of the first sum on the right of (4).
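
The superscripts attached to the pentagonal elements above can be recomputed from their rings. The sketch below assumes the weighting of (1), namely that a contact with a hexagon counts 1 and a contact with any major polygon counts 2, so that a pentagon with ring contribution $4 + r$ is an $A_5^{(r)}$. Rings are written as strings over the characters 5, 6, 7, N; the primed N' of the text is written N, and the function names are illustrative.

```python
# Weight of each neighbour in the left member of (1): pentagons 0,
# hexagons 1, major polygons (written 7 or N) 2.
WEIGHTS = {"5": 0, "6": 1, "7": 2, "N": 2}


def contribution(ring):
    """Contribution of a pentagon with the given ring of five neighbours."""
    return sum(WEIGHTS[symbol] for symbol in ring)


def excess(ring):
    """The index r such that the pentagon is an A_5^(r)."""
    return contribution(ring) - 4


# Rings listed in the text, grouped by the index r they are claimed to have.
LISTED = {
    1: ["6N55N", "655NN", "6566N"],
    2: ["6N56N", "656NN", "75N57", "75N5N"],
    3: ["6N5NN", "6N66N", "65NNN", "75N6N"],
    4: ["6NN6N", "75NNN"],
    5: ["6NNNN"],
}
```

Each ring in `LISTED` has the excess under which it is filed, which is a convenient consistency check on the transcription of the rings.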

The number of occurrences of $A_n^{(r)}$ at d(66 · · · 65) cannot exceed $n - r$,
i.e. the number of hexagons touching d, which is at most equal to the coefficient
of $a_n^{(r)}$ in (4), unless n = 7, r = 2 or 3. Moreover, in the last two cases
the largest number of hexagons coming third in the sequence 6566 (or second
in 6656) is found by inspection to be 4 or 2 respectively, i.e. not more than
the coefficient of $a_7^{(r)}$. This concludes the demonstration of (4), and so
of (5).
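
The exceptional cases n = 7, r = 2 or 3 can be recovered by comparing $n - r$ with the coefficient of $a_n^{(r)}$ in (4). The sketch assumes that coefficient to be $2(3n - r - 17)$, which is the value consistent with the figures 4 and 2 quoted above; the function names are hypothetical.

```python
def coefficient(n, r):
    """Assumed coefficient of a_n^(r) on the right of (4)."""
    return 2 * (3 * n - r - 17)


def exceptions(max_n=12):
    """Pairs (n, r) for which the crude bound n - r exceeds the coefficient."""
    found = []
    for n in range(7, max_n + 1):
        # For a heptagon r <= 3 by F; otherwise r <= n - 2 by A.
        top = 3 if n == 7 else n - 2
        for r in range(1, top + 1):
            if n - r > coefficient(n, r):
                found.append((n, r))
    return found
```

With these assumptions `exceptions()` yields only the pairs (7, 2) and (7, 3), whose coefficients are 4 and 2.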

We now reduce⁸ the cases of D not contained in A or C, namely a hexagon
touching 56565 and 56665.


5 


⁸ The scheme of reduction is that explained in loc. cit. +. To accommodate a transformation
in one line a comma is sometimes used to mean ‘or,’ when it could not mean




N56565. See Fig. 1. 




(1) d, f, h=2; f=3 or h=2, unless dfh = 332, 342: u=1 


(2) dfh = 243 


12 e to g or f=3,9—s,4 e=2 
32 h to @ or e i=4,d=1,4 u=—4 h=2 
34 b to f c=] b=4 
13 g to % h=4 “= 
13 g tod 1=3 
32 e to a b=1,d=4 u=4 e=3; c,g=2,3 
u=3 
(3) dfh = 244 
43 f to b u=4 f=3, h=3,4 
(1) 
(4) dfh = 223 
13 g to e ore f=4, d=2,4 C2 CE) g=3 
34 g to b 2 u=2 
34 g to 4 ghi = 424 
(23 a to f or h g9=4,1 w=] 
23 d to f or h a=3 “u=2 abcd = 3248 ° 
24 b to 1 a=] uw=4 b=4 
12 etocorh u=2 e=2 
23 c to a or h b=1,1=—4,1 2 c=3 
u = 4) I= 
| 
(5) djh = 224 
31 g to h=2 “= ] 
31 g to e or f=4, d=2,4 (2), (4) g=3 
34 g to b a=] u=2 g=4 
41 g toi | 
41g toe ore f=3, d=2,3 
(4) 
(6) dfh = 323 
13 g to e f=4 (1) t 
34 ito g h=2 u=3 i=4; b,d=3,4 
u=] 
‘and.’ Thus a,b = 2,3 means a = 2 or 3 and b = 2 or 3. Also, if a chain affects
adjacent polygons, it suffices to note the change in one of them. It should be added
that, unless otherwise pointed out, an isthmus in the reduction implies a Birkhoff ring
in the original figure, as can be at once verified.


⁹ The absence of a 2 3 chain from d to b allows a = 3, as just given.


(7) dfh = 324 
14 9 toe ore f=3, d=2,3 “= g=4 
12 h to f ora g or i=3 
12 h toc g=3,d=4 u=] h=2 
23 a to d or f c=4,e=1,4 u= 
23 h to d or f g=—1, e=4,1 u =] a=h=3 
42 ito g or b h ora=1 u=1,4 i=2 
41 g toe ore f=3, d=3,2 w=] g=1 
24 b toi f=4 “= 1 b=—4 
43 b tod or h c=2 o0ri=1 u=2,1 b=3 
32 b to f c=e=4 u=3 b=2; =3,2 
u=2 
(8) dfh = 332 
23 h to f g=4 vw=3 h=2; d,b=2,3 
(1) or equivalent 
(9) dfh = 342 
41 etocort d=2 or h=3 e=4 
12 f toh ore g or e=3 “= f=2 
32 d to a or h o=4,i=—1,4 u=4 
32 f toaorh g=1, u= 3,1 d=2, f=: 
42 etogora f or b=1 u=1,3 e= 
41 g toiord h or e=3 u=1 I= 
13 g toi ore h or e=4 
21 h to f I= u=1 
21h toc =] h=1 
14 a to f g=2,i1=3 “= a=4; c,h=1,4 
u =] 
(10) dfh = 423 
14 i to g or e h=2, f=2,3 (1) 1i=4 
24 f tod ori eor g=3 u=3,1 f=4 
43 f to b c=e=2 u=4 f=3; dh=3,4 
u=3 
(11) dfh = 424 
14 g toe f=3 (1) g=4 
43 g to b a=l1,h=2 
43 g toi ghi = 323, d= 4,3 g=38; d=3,4 
“= 
N56665. See Fig. 2. 
(1) unless dfh = 243 (342) 




(2) 7=4 unless dfh = 222, 244 (332) w= 1 


(3) j= 4, dfh = 222 


12Zitoad i= 3 u=1 a=1 

42 tod a=3 u=] 

42 j to f 1=9=>3, d=4 u=3 j=2; h=2,4 
u=2 


(4) j=4, dfh = 244 


14 c toe d=3 1 c=4 
43 to b or h a=1o0r+t=2 u=4 
43 j to f s=f=3 
(23 d to b c=] u= d=33 f=3;2 
u = 3) j=3 
23 d to b c= u=1 d=3 
42 ctoaorh 7=3, 1 
42 c to f 
(21 c to a or e bord=4 u=] 
21 c to h=3 u=1 c=1 
u=1) c=2 
34 b tod c=1 u=!] 
34 b to f d=4 b=4; h,j=3,4 
u=1 or (38) 
G. NN55655. See Fig. 3. 
(1) d=3 or h=2 or f=4 u=] 
(2) dfh = 444 
43 d to b d=3; f,h=—4,3 
uw] 
(3) dfh = 244 
13 i to g, e,c h = 2, ete. (1) t= 3 
23 d to b c= 4 u=4 d=3 
31 i to b or g a=4 or h=2 u=1 =] 
(1) 
(4) dfh = 243 
14 e toc or i d=3 or h=2 (1) om 
34 b to h or e u=2,1 b=4 
14 ¢ to e or i d or 2 c= 4 
12 b to d or f 45 u=3 b=2 
24 4 to g or b h=1ora=3 u=4 i=4 


A pair of pentagons in triad with a heptagon and bounded elsewhere by
minor polygons. See Fig. 4.




The polygons g, i and f, j must be hexagons on account of A and B
respectively. The case where h is also a hexagon has lately been reduced by
Franklin.¹ The remaining configuration, where h is a pentagon, can be
colored immediately in the present reduced figure:


(1) a=mb=—2 u=2,7=—1 


(2) a=—2, b=3 v—1, unless d—3, c=—4; then u—1l, 
(3) u==3, v=—1, unless d=2, c=—4; then 


(4) a=3, b=2 u=—2 or 4, 

or 4. 

A pair of hexagons touching 55655 or 556655. See Figs. 5 and 6. 

We may suppose that a hexagon of the chain forms a triad with the given
pair, as otherwise we get the reduction C.

We can color both figures immediately by marking u, v, b with 1, 2, 3.
Then either 3 can be used for c, or else we get a choice for e (similarly for
d and f).

There is an obvious extension when u, v have 2k, 2l sides and touch
k - 2, l - 2 pairs of pentagons.

The reduction of the modified Errera ring R together with its pentagonal
caps is made by suppressing these except for alignments¹⁰ connecting the
remaining free vertices of caps, pairs of pentagons, hexagons and heptagons
(see Fig. 7).

As to the coloration of R, we note that, just as in Errera's case, a hexagon
or pair of pentagons can always be colored when one neighboring polygon of R
is already marked. Moreover, if the polygon a following cb (57) with the
cap d is already marked, we can always color bdc. For b is adjacent to two
other colors, namely that next to the part of R including c and the other color
bounding the unreduced alignment L of b. The marking of b leaves 3 colors
next to d. Then likewise 3 next to c.

We now have two possibilities according as the unreduced polygon e
abutting d bears the latter color bounding L or not. If so, we mark c with
this color and fill in R going away from b, with a final choice for bd. But,
failing this combination at any pair 57, all such pairs can be colored in the
reverse direction when the polygon of R previous to the pentagon is already
marked. Consequently we can then fill in R, starting from b with the color of
e, passing through a and finishing with a choice for cd.


¹⁰ The alignment crossing R at a pair of pentagons passes along their common
side. Those crossing R at a hexagon or heptagon are kept apart by tracing them round
the perimeter in the same sense from a given side of R.




We observe that in the reduced figure alignments crossing R may divide
it into a number of parts. So an isthmus is formed at an alignment if it is
the only one to cross it. The other possibilities of an isthmus are

(1) if a polygon u makes two separate contacts with the same part of R,
one contact only being reduced.

(2) if two adjacent polygons v, w make separate reduced contacts with
the same part of R.

(3) if a polygon t makes separate reduced contacts with consecutive
parts of R.

When R encloses a single polygon a, no alignment crosses it. In case (1)
we get a 4-ring formed by u, two polygons of R and a, or a 5-ring about more
than one polygon including a cap. In case (2) a similar 5-ring is formed
by v, w, two polygons of R and a.

When R encloses a pair or a triad, it is crossed by 2 or 3 alignments
respectively, or else two free vertices belonging to the same polygon or pair
of pentagons on this side of R give rise to a polygon of 4 sides or less. Cases
(1) and (2) yield the same result as above for the part of R considered.
Lastly, in case (3) we have a 5-ring about more than one polygon formed
by t, two polygons in the two parts of R and two of the enclosed polygons.


Thus the reducibility is unrestricted for the configurations in question. 


5(5)7665. See Fig. 8. 


(1) unless u= 
(2) d=e=—2 
24 d toa c=3 d—4 
(1) 


[Figs. 1-8. Diagrams of the configurations reduced above: Fig. 1 (N56565), Fig. 2 (N56665), Fig. 3 (NN55655), Fig. 4 (pair of pentagons in triad with a heptagon), Figs. 5 and 6 (pair of hexagons touching 55655 or 556655), Fig. 7 (modified Errera ring), Fig. 8 (5(5)7665).]


EGYPTIAN UNIVERSITY, CAIRO.

INFINITE PRODUCT MEASURES AND INFINITE 
CONVOLUTIONS.* 


By E. R. van KAMPEN. 


Introduction. The purpose of this paper is a systematic study of certain 
measurable functions on an infinite product space carrying a Lebesgue measure 
of the product type, especially the convergence theory of sequences of such 
functions and their distribution theory. Such a study is necessitated by the 
fact that these topics were considered during the last two decades from many 
different points of view by many authors and the development of the theory 
was quite slow. This warrants a uniform treatment of the central phases of 
the subject. An attempt will be made to approach each point by the method 
through which it is most easily accessible. There will result in this manner 
not only a systematical presentation of the general theory but quite naturally 
also several results which are not to be found in the literature. 

Although some references are given, no attempt has been made at a serious 
historical study of the subject. Numbers in square brackets refer to the list 
of references at the end of the paper. References to statement numbers in 
parentheses are preceded by the roman numeral of the Part in which the 
statement occurs, except if the reference occurs in the same Part. 

Part I concerns the theory of a product measure in a product space. The
idea of such a measure developed from the theory of probability, cf. [1]. Later
it took the form of a measure in certain special product spaces defined by
means of a measure preserving mapping, cf. [27], pp. 496-497, [2 bis], [28],
[29], [25], [3], [9]. Finally it took the form of a product of measures in a
product space, defined directly as the product of given abstract measures in the
factor spaces, cf. [20], [22], [8]. In Part I only so much is stated as is necessary
for the understanding of what follows. A proof is given of the 0-1
theorem stated as I (6). The development of this theorem may be followed
through a wide range of papers, for instance, [1], [28], [16], [21], [20],
[9], [31].

A measurable function on one factor of the product space may be con- 
sidered as a measurable function on the product space which is independent 
of all but one of the coördinates of each point of the product space. Part II
concerns formal series of such functions, each series containing one term for 
each factor of the product space. The convergence theory of such series is 


* Received June 19, 1939; Revised January 18, 1940. 


easily accessible on the basis of the product measure introduced in Part |. 


The central theorem is the Three Series Theorem (Theorem 1), which contains 
necessary and sufficient conditions for the convergence of the type of series in 
question. This theorem is due to Kolmogoroff, [17] and [18], who was led 
to this problem in connection with a particular series of independent functions 
introduced by Rademacher, [26]. 

In Part III it is shown how a simple mapping may transform a sequence 
of independent functions in the sense of Kolmogoroff, [20], or equivalently 
in the sense of Steinhaus, [11], into a sequence of functions of the type con- 
sidered in Part II. Thus one can write, corresponding to every theorem of 
Part II, a corresponding theorem on series of independent functions. The 
mapping of the space of Part III on the product space of Part II is not a 
correspondence between points of these spaces, but a measure preserving corre- 
spondence between sufficiently extensive classes of measurable sets in these 
spaces. For considerations of the type used here such a correspondence is 
sufficient (cf. [32], § 3). The last paragraph of Part III contains the negative
answer to a question of Kac and Steinhaus (cf. [30], § 6).

Part IV concerns the convergence theory of infinite convolutions. The 
results of Part II are transferred to the theory of infinite convolutions by 
means of Theorem V, cf. [10], Theorem 32. A first proof of this theorem is 
based on Theorem IV in § 10 and IV (6) in §17. Theorem IV, which is due 
to Jessen and Wintner ([10], pp. 84 and 85) is proved here by a method of 
Marcinkiewicz and Zygmund ([24], p. 119). The other result, IV (6), is 
usually proved by means of the theory of Fourier transforms ([10], Theorem
1). It is shown here that completely elementary methods are sufficient. A
second proof of Theorem V is based on II (17) and is independent of IV (6).
On the basis of Theorem V a list of theorems is stated without proof in § 20
and § 21. The relation of Parts II and IV is much more complicated than
the relation of Part II and Part III. For instance, it can hardly be said
that Theorems I and VI are equivalent, even though their analogy is imme- 
diately obvious; Theorem VI is due to Jessen and Wintner, [10], Theorem 3+. 
Similarly, Theorem 3 of [15] corresponds to (and is used to prove) that part 
of the last statement of §11 which has so far been proved. It would be 
desirable to invert this process. In other words, a simple proof of the last 
statement of § 11 would lead to a shorter proof and a better understanding 
of Theorem 3 of [15]. 

The pure theorem (Theorem VIII of § 22) is a generalization of (17) 
which is due to Jessen and Wintner ([10], Theorem 35). The remark that 
(17) may be extended to cover the case of any Hausdorff measure was com- 
municated to me by Wintner. It may be of interest to investigate how far 




one may allow more general given pure functions $\sigma_n$ in Theorem VIII. It is,
for instance, obvious that if $\ast\sigma_n$ is convergent and $\sigma_n$ is absolutely continuous
for at least one value of $n$, then $\ast\sigma_n$ is absolutely continuous.

It may be considered undesirable to prove a statement on convolutions 
like Theorem VI by means of series on infinite product spaces. However, at 
present it does not seem possible to prove Theorem VI without leaving the 
domain proper of distribution functions and their convolutions. Thus, for 
instance, the proof of Theorem VI which is sketched in § 20 includes a short 
excursion to the domain of Part I and an essential use of the theory of Fourier- 
Stieltjes transforms of distribution functions. An account of the latter may 
be found in [5] and many applications in [6], [10], [33], [35], [387]. How- 
ever, in view of the criterion in IV (1) for the convergence of a sequence 
of distribution functions, it may be considered probable that eventually a 
reasonably simple proof of Theorem VI within the domain proper of that 
Theorem will be constructed. A presentation of the theory of distribution 
functions as a whole may be found in a course of lectures by Wintner at the 
Institute for Advanced Study, 1937-1938; a previous presentation is con- 
tained in [10]. 

The functions of Part II are real valued and the distributions of Part IV
are 1-dimensional. This restriction is quite unessential. The extension to 
vector valued functions and more dimensional distributions involves only 
formal complications, but no essential difficulties. For such extensions in 
different situations compare [5], [6], [7], [10]. 

The convergence theory of series of independent random variables is not
discussed in this paper. This theory, which from the historical point of view 
precedes the others, represents from the methodical point of view, an attempt 
to combine the advantages of the other theories. Apparently this combination 
succeeds only at the cost of some clarity, so that it seems preferable to consider 
the points of view of functions of independent variables and of distribution 
functions separately. A comprehensive treatment of this side of the question 
may be found in the well known treatise of P. Lévy: Théorie de l'addition
des variables aléatoires, Paris (1937).


PART I. Product Measures. 


1. Let $X_n$, $n = 1, 2, \cdots$, be an infinite sequence of sets and $X = \Pi X_n$
the product set of the $X_n$, i. e., the set of elements

(1) $x = \{x_n\} = \{x_1, x_2, x_3, \cdots\},$

where the n-th coördinate $x_n$ of $x$ is an arbitrary element of $X_n$. This product
satisfies the commutative and associative laws with regard to any form of
permutation and bracketing of the sequence of integers $n = 1, 2, 3, \cdots$.
If this sequence is divided into two parts I, II, and correspondingly one writes
$X = X_I \times X_{II}$, and if $C_I$ is any subset of $X_I$, then $C_I \times X_{II}$ will be denoted
by $(C_I)_X$. As an example for this convenient notation one has $(A_1 \times A_2)_X
= A_1 \times A_2 \times X_3 \times X_4 \times \cdots$ if $A_1 \subset X_1$, $A_2 \subset X_2$. The symbols $X_{.n}$, $X_{n.}$
will be used to denote the products $X_1 \times X_2 \times \cdots \times X_n$, $X_{n+1} \times X_{n+2} \times \cdots$,
so that $X = X_{.n} \times X_{n.}$; subsets of $X_{.n}$, $X_{n.}$ will be provided with similar
subscripts.


2. Let every space $X_n$ carry an absolutely additive non-negative measure
$\mu_n B_n$, defined for the sets $B_n$ belonging to the field $\mathfrak{B}_n$ of $\mu_n$-measurable sets,
and suppose that $\mu_n X_n = 1$. By the definition

(2) $\mu B = \prod_{k=1}^{n} \mu_k B_k$, where $B = (\prod_{k=1}^{n} B_k)_X$ and $B_k \in \mathfrak{B}_k$,

and by subsequent uniquely determined extension (based, for instance, on the
method of the exterior measure), an absolutely additive, non-negative measure
$\mu B$ may be defined for all sets $B$ in a field $\mathfrak{B}$ of $\mu$-measurable subsets of $X$.
This product measure $\mu = \Pi\mu_n$ has the property that
(3) The set $\Pi A_n$, where $A_n \subset X_n$, is $\mu$-measurable if and only if either each
$A_n$ is $\mu_n$-measurable, in which case $\mu\Pi A_n = \Pi\mu_n A_n = \Pi\mu(A_n)_X$, or $\mu\Pi A_n = 0$.


Thus the use of $\mu$ both in (2) and as a notation for the product measure
$\Pi\mu_n$ is justified. In particular (3) implies that $\mu X = 1$. The proof of the
above statements may be based, for instance, on the following intuitive lemma,
used for this purpose by v. Neumann in a course of lectures at the Institute
for Advanced Study, 1934-35.


(4) If $A_n \in \mathfrak{B}_n$, $A = \Pi A_n$ and $A \subset \Sigma B^m$, where each $B^m$ is a set of the
type occurring in (3), then

$\Pi\mu_n A_n \le \Sigma_m \mu B^m$
holds for the function $\mu$ defined in (2).

3. The measure $\mu = \prod \mu_n$ satisfies the commutative and associative laws with regard to any permutation and bracketing of the sequence of integers $n = 1, 2, 3, \dots$. In particular, if again $X = X_I \times X_{II}$, then $\mu = \mu_I \times \mu_{II}$, where $\mu_I$, $\mu_{II}$ are the product measures of the $\mu_n$ belonging to $X_I$, $X_{II}$. Thus the theorem of Fubini may be applied to any factorization of $X$ into two factors. This proves, for instance, the following statement, if one considers that the product $B_{.n} \times C_{n.}$ is equal to the common part of $(B_{.n})_X$ and $(C_{n.})_X$:

(5) If $B_{.n}$, $C_{n.}$ are measurable sets in $X_{.n}$, $X_{n.}$ respectively, then
$\mu(B_{.n} \times C_{n.}) = \mu\bigl( (B_{.n})_X (C_{n.})_X \bigr) = \mu (B_{.n})_X \, \mu (C_{n.})_X$.




INFINITE PRODUCT MEASURES AND INFINITE CONVOLUTIONS. 421 


The following well known theorem is important for the development of the theory of the product measure $\mu = \prod \mu_n$:

(6) If a measurable set $C \subset X$ is of the form $(C_{n.})_X$ for every n, then either $\mu(C) = 1$ or $\mu(C) = 0$.

The condition concerning $C$ is to the effect that a point (1) in $C$ remains in $C$ if any one of its coordinates is modified arbitrarily. The proof of (6) proceeds as follows:

A totally additive, non-negative measure function $\nu B$ may be defined on $\mathfrak{B}$ by the equation $\nu B = \mu BC$, where $BC$ denotes the common part of $B$ and $C$. If $B$ is a set of the type $B = (B_{.n})_X$ then $\nu B$ is, by (5), of the form

$\nu B = \mu\bigl( (B_{.n})_X (C_{n.})_X \bigr) = \mu B \, \mu C$.

Since the measure $\nu$ is uniquely determined on $\mathfrak{B}$ by its values on all sets of the form $B = (B_{.n})_X$, this implies $\nu B = \mu BC = \mu B \, \mu C$ for every measurable set. On placing $B = C$ one obtains the statement (6).
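
The last step can be written out explicitly. The lines below are only a sketch of the substitution $B = C$, with $\mu$ the product measure and $BC$ the common part of $B$ and $C$, as in the proof above.

```latex
% Sketch: the zero-one alternative obtained by placing B = C.
\begin{align*}
  \mu(BC) &= \mu B \,\mu C      && \text{for every measurable } B, \\
  \mu C = \mu(CC) &= (\mu C)^2  && \text{on placing } B = C, \\
  \mu C\,(1 - \mu C) &= 0       && \text{so that } \mu C = 0 \text{ or } \mu C = 1 .
\end{align*}
```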


4. The following convention will be used concerning $X_n$, $\mu_n$, $X$, $\mu$, $f_n$:

(7) $X_n$ is a space which carries a measure $\mu_n$ such that $\mu_n X_n = 1$. The product space $X = \prod X_n$ carries the product measure $\mu = \prod \mu_n$, so that $\mu X = 1$. The symbol $f_n$ represents a given $\mu_n$-measurable function $f_n = f_n(x_n)$ on $X_n$, and the same symbol represents the $\mu$-measurable function $f_n = f_n(x)$ on $X$, which is defined by: $f_n(x) = f_n(x_n)$ if the n-th coordinate of $x$ is $x_n$, cf. (1).

Thus, for instance, in the symbol $\sum f_n$, the $f_n$ are thought of as functions on $X$, since otherwise addition would not have a meaning, and one finds for the k-th moment $M_k(f_n)$ of $f_n$ the two expressions

$M_k(f_n) = \int_{X_n} f_n(x_n)^k \, dX_n = \int_X f_n(x)^k \, dX$,

if at least one of the integrals exists. The flexibility in the manipulation of integrals on $X$ which one attains by means of Fubini's theorem is illustrated by the following example, which is typical of many situations in Part II:

(8) If $n \le l < m$, $f_n(x_n)$ and $f_m(x_m)$ are integrable, and $C$ is a set of the type $C = (C_{.l})_X$, then

$\int_C f_n(x) f_m(x) \, dX = \int_C f_n(x) \, dX \int_X f_m(x) \, dX$.


A special case of (7) is represented by (9). The use of the $r_n$ on $Z$ is equivalent with the use of the well-known Rademacher functions, cf. [26].

(9) $Z_n$ is a space which consists of two points, $z'_n$, $z''_n$, each of which has the $\nu_n$-measure $\tfrac{1}{2}$, so that $\nu_n Z_n = 1$. The product space $Z = \prod Z_n$ carries the measure $\nu = \prod \nu_n$, so that $\nu Z = 1$. The function $r_n$ is defined on $Z_n$ by $r_n(z'_n) = 1$, $r_n(z''_n) = -1$; and $r_n$ is defined on $Z$ according to the last part of (7).

Two functions $f$ on $X$ and $g$ on $Y$ are said to be equimeasurable if $\mu[f(x) > \omega] = \lambda[g(y) > \omega]$ for every real $\omega$. Here $[f(x) > \omega]$, for instance, represents the $x$-set defined by the inequality $f(x) > \omega$. Let the functions $g_n$ on $Y_n$ satisfy a convention similar to (7) and let $f_n$ and $g_n$ be equimeasurable for every n. If any limiting process (reducible to convergence in measure) is applied both to the $f_n$ and the $g_n$, then the resulting functions are defined on sets of the same measure and equimeasurable on those sets.

A function $f$ on $X$ is said to be symmetrically distributed if $\mu[f(x) < \omega] = \mu[f(x) > -\omega]$ for every real $\omega$. If the spaces $X_n$ and $Y_n$ are in 1-1 measure preserving correspondence, so that the same holds for $X$ and $Y$, and if $f_n(x_n) = g_n(y_n)$ if the points $x_n$ and $y_n$ correspond, then it is easy to see that the function $f^*_n(x_n, y_n) = f_n(x_n) - g_n(y_n)$ is symmetrically distributed on $X_n \times Y_n$, hence also that $f^*_n(x, y)$ is symmetrically distributed on $X \times Y = \prod (X_n \times Y_n)$. Moreover, $M_1(f^*_n) = 0$ and $M_2(f^*_n) = \bar{M}_2(f^*_n) = 2 \bar{M}_2(f_n)$. Here $\bar{M}_2(f)$ denotes the value of the second moment of $f - M_1(f)$, i.e., the minimum value of the second moments of the functions $f - \text{const.}$; thus $\bar{M}_2(f) = M_2(f) - (M_1(f))^2$.
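
The relation between the moments of $f^*_n$ and $f_n$ may be checked directly. The following is a sketch under the assumptions of this section: $f$ and $g$ are equimeasurable (so that they have the same moments), $f^*(x, y) = f(x) - g(y)$ is taken on the product of the two spaces, and $\bar{M}_2$ is written for the second moment about the mean.

```latex
% Sketch: moments of the symmetrized function f^*(x, y) = f(x) - g(y).
\begin{align*}
  M_1(f^*) &= M_1(f) - M_1(g) = 0, \\
  M_2(f^*) &= M_2(f) - 2\,M_1(f)\,M_1(g) + M_2(g)
              && \text{(Fubini applied to the cross term)} \\
           &= 2\,M_2(f) - 2\,M_1(f)^2 = 2\,\bar{M}_2(f), \\
  \bar{M}_2(f^*) &= M_2(f^*) - M_1(f^*)^2 = M_2(f^*) .
\end{align*}
```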

It is also easy to see that if $f_n$ is symmetrically distributed on $X_n$, then the functions $f_n$ on $X_n$ and $r_n f_n$ on $X_n \times Z_n$ are equimeasurable, cf. (9). Thus, if $f_n$ is symmetrically distributed for every n, then any limiting process applied to the sequences $\{f_n\}$ on $X$ and $\{r_n f_n\}$ on $X \times Z = \prod (X_n \times Z_n)$ leads to equimeasurable results.


PART II. Series of Functions of Independent Variables. 


In Part II, a number of more or less known criteria are given for the convergence of a series $\sum f_n$ on $X = \prod X_n$, where each $f_n = f_n(x)$ is obtained from a function $f_n = f_n(x_n)$ on $X_n$, according to the convention at the beginning of §4. The main criterion is stated as the "three series theorem" (Theorem I). It was obtained first by Kolmogoroff, who used the language of the theory of random variables, and restricted these to take on an at most enumerable set of values. His original proof in [17] turned out to be most convenient for a systematic presentation of the subject (§5 and §6). Some simplification could be obtained. For instance, II (8) is replaced by II (2), and the proof of II (6) is separated into two parts, the first of which proves the separate lemma, II (5). The presentation of the proof of Theorem I may be varied in numerous ways.

In §7 and §8 conditions are given for the unconditional and absolute convergence of $\sum f_n$ on $X$. They follow easily from Theorem I. Theorem III is essentially simpler in character than Theorem I, from which it is here obtained, and may be proved directly by means of a lemma analogous to (5), concerning absolute convergence. Conditions as used in (12) of §9 occur for $q = 2$, $p = 4$ in [11], Theorem 4, and for $q = 1$, $p = 2$ in [23], §3. Theorem IV of §10 is essential for the coordination of Part II and Part IV.


5. This §5 contains some lemmas needed in the proof of the three series theorem. The convention I (7) is of course essential.

(1) The series $\sum f_n$ is either almost everywhere convergent or almost everywhere divergent. This is an immediate consequence of I (6).

(2) If for an arbitrary $K > 0$ and every n, the function $f'_n$ is defined by $f'_n = f_n$ or $f'_n = K$, according as $|f_n| \le K$ or $|f_n| > K$, and also if $f'_n$ is defined by $f'_n = f_n$ or $f'_n = -K$, according as $|f_n| \le K$ or $|f_n| > K$, then $\sum f_n$ and $\sum f'_n$ are simultaneously almost everywhere convergent on $X$ and almost everywhere divergent on $X$. This is clear, since, for a fixed $x$, the passage from either of the series $\sum f_n$, $\sum f'_n$ to the other involves only terms which are of absolute value not less than $K$.


(3) If $\sum M_2(f_n) < +\infty$ and $M_1(f_n) = 0$ for every n, then $\sum f_n$ is convergent almost everywhere on $X$.

Suppose that $M_2(f_n)$ exists and $M_1(f_n) = 0$ for every n, and that $\sum f_n$ is divergent on a subset of positive measure of $X$. It will be shown that $\sum M_2(f_n)$ is divergent.

If $s_n$ denotes the n-th partial sum of $\sum f_n$, then the sequence $\{s_n\}$ is divergent on a subset of positive measure of $X$. Thus at almost every $x$ on $X$, the oscillation of the sequence $s_{m+1}, s_{m+2}, \dots$ is not less than a positive number $a = a(x)$ which depends on $x$ but not on $m$. Thus it is evident that, for sufficiently small $b > 0$, the set $D^b$ defined by the inequality $a = a(x) > b$ is not a zero set. This means that there exists a number $b > 0$ such that $D$ is not a 0-set, where $D$ is defined by the condition that $|s_n - s_m| > b$ holds for every $m$ and at least one $n = n(x) > m$. If $m$ is arbitrarily fixed, and $k > m$, let $D^k$ be the set defined by $|s_n(x) - s_m(x)| \le b$ for $m < n < k$ and $|s_k - s_m| > b$. The sets $D^k$ are disjoint, $\sum D^k$ contains $D$ and $D^k$ is a set of the type $(A_{.k})_X$. Let $n$ be fixed in such a way that $\mu \sum_{m < k \le n} D^k > \tfrac{1}{2} \mu D$. Since $M_1(f_k) = 0$, one sees from I (8) that

$\int_{D^k} (s_n - s_k)(s_k - s_m) \, dX = \int_X (s_n - s_k) \, dX \cdot \int_{D^k} (s_k - s_m) \, dX = 0$.

Taking in account the definition of $D^k$, one finds for $m < k \le n$,

$\int_{D^k} (s_n - s_m)^2 \, dX = \int_{D^k} \bigl( (s_n - s_k)^2 + (s_k - s_m)^2 \bigr) \, dX \ge \int_{D^k} (s_k - s_m)^2 \, dX > b^2 \mu D^k$,

so that

$\sum_{k=m+1}^{n} M_2(f_k) = \int_X (s_n - s_m)^2 \, dX \ge \sum_{k=m+1}^{n} \int_{D^k} (s_n - s_m)^2 \, dX > b^2 \sum_{k=m+1}^{n} \mu D^k > \tfrac{1}{2} b^2 \mu D$.

Since $m$ was arbitrarily chosen, it follows that $\sum M_2(f_n)$ is divergent, so that the proof of (3) is complete.


(4) If $\sum \bar{M}_2(f_n) < +\infty$, then $\sum f_n$ is convergent almost everywhere on $X$ if and only if $\sum M_1(f_n)$ is convergent; in which case $\sum f_n$ converges in the mean ($L^2$) on $X$ to $f = \sum f_n$ and

$M_1(f) = \sum M_1(f_n)$, $\quad \bar{M}_2(f) = \sum \bar{M}_2(f_n)$.

In fact, if $g_n$ is defined by $g_n = f_n - M_1(f_n)$, then $M_1(g_n) = 0$ and $0 \le M_2(g_n) = \bar{M}_2(f_n)$, so that $\sum M_2(g_n) < +\infty$. Thus according to (3), the series $\sum g_n$ is convergent almost everywhere on $X$. The first part of (4) is now evident from the definition of $g_n$. In order to prove the remaining statements of (4), put $g = \sum g_n = f - \sum M_1(f_n)$ and let $t_n$ denote the n-th partial sum of $\sum g_n$. Since the functions $g_n$ are orthogonal on $X$, one has

$M_2(t_n) = \sum_{k=1}^{n} M_2(g_k)$.

From the equality one obtains by means of the theorem of Fatou the inequality $M_2(g) \le \sum M_2(g_n)$, so that $M_2(g) = \sum M_2(g_n)$. Since the last identity may be read as the Parseval identity of the expansion $g = \sum g_n$, the remaining statements of (4) are now evident.
(5) If $|f_n(x)| < K$ for every $x$ and every n, if $M_1(f_n) = 0$ for every n and if $\sum f_n$ is almost everywhere convergent, then $\sum M_2(f_n)$ is convergent.

Let $s_n = s_n(x)$ denote the n-th partial sum of the series $\sum f_n(x)$. Then the last assumption of (5) implies, in view of Egoroff's theorem, that $|s_n(x)| < N$ for some $N > 0$ and for every $x$ in a set $C$ of positive measure. Let the set $C^n$ be defined by the inequalities $|s_k(x)| < N$ for $0 < k \le n$, then $C^n$ is a set of the form $C^n = (A_{.n})_X$. Moreover, $C^{n-1} \supset C^n \supset C$. On placing $T_n = \int_{C^n} s_n^2 \, dX$ and $D^n = C^{n-1} - C^n$, one has

$T_k - T_{k-1} = \int_{C^{k-1}} f_k^2 \, dX + 2 \int_{C^{k-1}} f_k s_{k-1} \, dX - \int_{D^k} s_k^2 \, dX$.

Application of I (8) gives the value $M_2(f_k) \mu C^{k-1}$ for the first integral and 0 for the second integral, since $M_1(f_n) = 0$ for every n. Thus, since $\mu C^{k-1} \ge \mu C$ and $|s_k| \le K + N$ on $D^k$,

$T_k - T_{k-1} \ge M_2(f_k) \mu C - (K + N)^2 \mu D^k$.

Summation of this relation for $1 \le k \le n$ gives, since $\mu D^k = \mu C^{k-1} - \mu C^k$ and $\mu C^0 = 1$,

$N^2 \ge T_n = T_n - T_0 \ge \sum_{k=1}^{n} M_2(f_k) \mu C - (K + N)^2$;

so that the convergence of $\sum M_2(f_n)$ is evident.
(6) If $|f_n(x)| < K$ for every $x$ and every n, and if $\sum f_n$ is almost everywhere convergent on $X$, then both series $\sum M_1(f_n)$ and $\sum \bar{M}_2(f_n)$ are convergent; so that $\sum f_n$ belongs to the class ($L^2$) on $X$, (cf. (4)).

Let $f^*_n(x_n, y_n)$ be the function defined in §4, so that $|f^*_n(x_n, y_n)| < 2K$, $M_1(f^*_n) = 0$, $M_2(f^*_n) = 2 \bar{M}_2(f_n)$ and $\sum f^*_n$ is convergent almost everywhere on $X \times Y = \prod (X_n \times Y_n)$. From (5), which is thus shown to be applicable, one obtains the convergence of $\sum \bar{M}_2(f_n)$. Since $\sum (f_n - M_1(f_n))$ is convergent almost everywhere by (4), it is clear that $\sum M_1(f_n)$ is convergent. Finally, that $\sum f_n$ belongs to the class ($L^2$) on $X$, is evident from (4).

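6. From the statements (2) and (6) it is easy to obtain the three series theorem:

THEOREM I (Three series theorem). If $A_n$ denotes the subset of $X_n$ defined by the inequality $|f_n| \le K$, then the series $\sum f_n$ is almost everywhere convergent on $X$ if and only if all three series

(*) $\sum \mu_n(X_n - A_n)$, $\quad$ (**) $\sum \int_{A_n} f_n \, dX_n$, $\quad$ (***) $\sum \left[ \int_{A_n} f_n^2 \, dX_n - \left( \int_{A_n} f_n \, dX_n \right)^2 \right]$

are convergent for a fixed $K > 0$, in which case they are convergent for every $K > 0$.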


In fact, if $f'_n$ is any one of the functions defined in (2) for $n = 1, 2, 3, \dots$, then $\sum f_n$ is almost everywhere convergent on $X$, if and only if the same holds for $\sum f'_n$. Application of (6) to $\sum f'_n$ now shows that this is the case if and only if $\sum M_1(f'_n)$ and $\sum \bar{M}_2(f'_n)$ are convergent. Since clearly

$M_1(f'_n) = \int_{A_n} f_n \, dX_n \pm K \mu_n(X_n - A_n)$ and

$\bar{M}_2(f'_n) = \int_{A_n} f_n^2 \, dX_n - \left( \int_{A_n} f_n \, dX_n \right)^2 \mp 2K \mu_n(X_n - A_n) \int_{A_n} f_n \, dX_n + K^2 \bigl( \mu_n(X_n - A_n) - \mu_n(X_n - A_n)^2 \bigr)$,

the convergence of $\sum M_1(f'_n)$ and $\sum \bar{M}_2(f'_n)$ for both functions $f'_n$ is equivalent to the convergence of the three series occurring in Theorem I.
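
The two expressions for the moments of $f'_n$ can be verified by a short computation. In this sketch the abbreviations $a$, $I_1$, $I_2$ are introduced only for illustration and do not occur in the text. The upper signs belong to the choice $f'_n = K$ and the lower signs to $f'_n = -K$ outside $A_n$, and $\bar{M}_2$ denotes the second moment about the mean.

```latex
% Abbreviations (illustrative only):
%   a   = \mu_n(X_n - A_n),
%   I_1 = \int_{A_n} f_n \, dX_n,   I_2 = \int_{A_n} f_n^2 \, dX_n .
\begin{align*}
  M_1(f'_n) &= I_1 \pm K a, \\
  M_2(f'_n) &= I_2 + K^2 a, \\
  \bar{M}_2(f'_n) &= M_2(f'_n) - M_1(f'_n)^2
     = (I_2 - I_1^2) \mp 2 K a\, I_1 + K^2 (a - a^2).
\end{align*}
```

Adding and subtracting the two versions of $\sum M_1(f'_n)$ gives the convergence of $\sum I_1$ and $\sum a$ separately. Since $|I_1| \le K$, the terms $2KaI_1$ and $K^2(a - a^2)$ are then bounded by multiples of $a$, so that only $\sum (I_2 - I_1^2)$ remains to be considered.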

The following simple sufficient condition for convergence is an immediate consequence of Theorem I and may sometimes be easier of access; cf. [23] Theorem 5.

(7) The series $\sum f_n(x)$ is convergent almost everywhere on $X$ if so are both series $\sum M_1(f_n)$ and $\sum M_p(|f_n|)$ for a fixed $p$, $1 < p \le 2$.

In fact, if the second series is convergent, then so are the series $\sum \mu_n(X_n - A_n)$, $\sum \int_{X_n - A_n} |f_n| \, dX_n$ and $\sum \int_{A_n} f_n^2 \, dX_n$, where the $A_n$ are the sets defined in Theorem I. Thus (7) is an immediate consequence of Theorem I.
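
The step from the moment series to the series of Theorem I can be made explicit. The bounds below are a sketch, assuming $M_p(|f_n|) = \int_{X_n} |f_n|^p \, dX_n$ with $1 \le p \le 2$, and $A_n = [\,|f_n| \le K\,]$ as in Theorem I.

```latex
% Sketch: bounds in terms of the p-th absolute moment, 1 <= p <= 2.
\begin{align*}
  \mu_n(X_n - A_n) &\le K^{-p} \int_{X_n - A_n} |f_n|^p \, dX_n \le K^{-p} M_p(|f_n|), \\
  \int_{X_n - A_n} |f_n| \, dX_n &\le K^{1-p} \int_{X_n - A_n} |f_n|^p \, dX_n \le K^{1-p} M_p(|f_n|), \\
  \int_{A_n} f_n^2 \, dX_n &\le K^{2-p} \int_{A_n} |f_n|^p \, dX_n \le K^{2-p} M_p(|f_n|).
\end{align*}
```

The second bound, together with the convergence of $\sum M_1(f_n)$, gives the convergence of $\sum \int_{A_n} f_n \, dX_n$. The third bound also controls $\left( \int_{A_n} f_n \, dX_n \right)^2$, by Schwarz' inequality.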
On applying Theorem I to the functions $f_n$ defined by $f_n(x) = 1$ or $f_n(x) = 0$ according as $x$ does or does not belong to a given measurable set $B_n$ on $X_n$, one obtains the following statement (which occurs as a lemma in the usual proof of Theorem I; cf. [17], [18]).

(8) If $B_n$ is a measurable subset of $X_n$ for every n, then the measure of the set of points $x$ of $X$ which have infinitely many coordinates $x_n$ in the sets $B_n$ is 0 or 1 according as $\sum \mu_n(B_n)$ is convergent or divergent.
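
The reduction of Theorem I to (8) can be sketched as follows, assuming the choice $0 < K < 1$ (any such $K$ will do) and $f_n$ equal to the characteristic function of $B_n$.

```latex
% Sketch: the three series of Theorem I for characteristic functions, 0 < K < 1.
\begin{align*}
  A_n &= [\,|f_n| \le K\,] = X_n - B_n, \qquad f_n = 0 \text{ on } A_n, \\
  \sum \mu_n(X_n - A_n) &= \sum \mu_n(B_n), \\
  \sum \int_{A_n} f_n \, dX_n &= 0, \qquad
  \sum \Bigl[ \int_{A_n} f_n^2 \, dX_n - \Bigl( \int_{A_n} f_n \, dX_n \Bigr)^2 \Bigr] = 0 .
\end{align*}
```

Since every term of $\sum f_n(x)$ is 0 or 1, the series converges at $x$ exactly when only finitely many coordinates $x_n$ lie in the sets $B_n$, so the three series theorem reduces to the convergence of $\sum \mu_n(B_n)$.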


As remarked by M. Kac, application of Theorem I and of the other convergence criteria of Part II, to the series $\sum f_n/n$ leads to most of the well known sufficient conditions for the strong law of great numbers; cf. [11] p. 54. In fact, if $\sum f_n/n$ is convergent, then $(f_1 + \cdots + f_n)/n \to 0$. For instance, on applying (3) to the series $\sum f_n/n$, one obtains the following sufficient criterion for the law of great numbers.

(9) If $M_2(f_n)$ exists for every n, and if $\sum M_1(f_n)/n$ and $\sum \bar{M}_2(f_n)/n^2$ are convergent, then

$(f_1 + \cdots + f_n)/n \to 0$ as $n \to \infty$,

almost everywhere on $X$.
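
The implication from the convergence of $\sum f_n/n$ to the vanishing of the averages is Kronecker's lemma. A sketch of the standard summation by parts follows, written for a numerical sequence $a_n$ (to be replaced by $f_n(x)$ at a fixed point $x$); the auxiliary notation $u_n$ is introduced only for this illustration.

```latex
% u_n = \sum_{k=1}^{n} a_k / k,  u_0 = 0,  and u_n \to u by assumption.
\begin{align*}
  \sum_{k=1}^{n} a_k &= \sum_{k=1}^{n} k\,(u_k - u_{k-1})
                      = n\,u_n - \sum_{k=1}^{n-1} u_k, \\
  \frac{a_1 + \cdots + a_n}{n} &= u_n - \frac{1}{n} \sum_{k=1}^{n-1} u_k
                      \;\longrightarrow\; u - u = 0 \qquad (n \to \infty).
\end{align*}
```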


7. Since in (4), the series $\sum \bar{M}_2(f_n)$ has non-negative terms, and since the product space $X$ and its product measure $\mu$ satisfy the commutative law under any permutation of the factors $X_n$ and $\mu_n$, one may read (4) as follows:

(10) If $M_2(f_n)$ exists for every n and if $\sum \bar{M}_2(f_n)$ is convergent, then the sum $\sum f_n$ is convergent almost everywhere, no matter in what fixed order the terms are taken, if and only if $\sum M_1(f_n)$ is absolutely convergent.

If $t'$ and $t''$ denote the sums of two rearrangements of $\sum f_n$, then $t' = t''$ almost everywhere on $X$.




In order to prove this last statement, let $g_n$ be defined again as $f_n - M_1(f_n)$, let $t_n'$, $t_n''$ be the partial sums of the two rearrangements of $\sum g_n$, and let $T_n'$, $T_n''$ be the partial sums of the corresponding rearrangements of $\sum M_2(g_n)$. Then


(ta! — ty’) = Ty! — Ty’, 
X 


so that the integral tends to 0, as $n \to \infty$. Hence, by Fatou's theorem,

$\int_X (t' - t'')^2 \, dX = 0$,

so that $t' = t''$ almost everywhere on $X$.

Using (10) instead of (4) in the proof, one obtains as in §6 the following theorem, where notations of Theorem I are used:

THEOREM II. The series $\sum f_n(x)$ is convergent almost everywhere on $X$, no matter in what fixed order the terms are taken, if and only if the series (*) and $\sum \int_{A_n} f_n^2 \, dX_n$ are convergent and the series (**) is absolutely convergent. Moreover, if $t'$ and $t''$ denote the sum of two rearrangements of $\sum f_n$, then $t' = t''$ almost everywhere on $X$.

As an immediate consequence of Theorem II, one sees that if $\sum f_n$ is convergent almost everywhere on $X$, then constants $a_n$ may be determined in such a way that the type of convergence of Theorem II holds for $\sum (f_n - a_n)$. One may select for instance as $a_n$ the terms of the series (**) of Theorem I.


8. The type of convergence of $\sum f_n$ which was discussed in §7 should be distinguished from the unconditional convergence of $\sum f_n$ almost everywhere on $X$. The latter is equivalent to the convergence of $\sum |f_n|$ almost everywhere in $X$, and obviously implies the type of convergence discussed in §7. Simple examples show that the converse implication does not hold. For instance, the series $\sum r_n/n$ (cf. I (9)), satisfies the requirements of Theorem II, but $\sum |r_n|/n$ is divergent everywhere on $Z$.
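
The example can be checked against Theorem II. In the sketch below the value $K = 1$ is an assumption made for convenience; $r_n$ takes the values $\pm 1$ with measure $\tfrac{1}{2}$ each, as in I (9).

```latex
% Sketch: the series of Theorem II for f_n = r_n / n with K = 1.
\begin{align*}
  A_n &= [\,|r_n/n| \le 1\,] = Z_n, \qquad \nu_n(Z_n - A_n) = 0, \\
  \int_{A_n} \frac{r_n}{n} \, dZ_n &= \frac{1}{n} \Bigl( \tfrac{1}{2} - \tfrac{1}{2} \Bigr) = 0, \\
  \sum_n \int_{A_n} \frac{r_n^2}{n^2} \, dZ_n &= \sum_n \frac{1}{n^2} < +\infty, \qquad
  \sum_n \frac{|r_n|}{n} = \sum_n \frac{1}{n} = +\infty .
\end{align*}
```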

A condition for the convergence of $\sum |f_n|$ almost everywhere on $X$ may be obtained very easily from Theorem I. In fact, the set $A_n$ of Theorem I is the same for $f_n$ and $|f_n|$. Thus the statement of Theorem I for the series $\sum |f_n|$, is that this series is almost everywhere convergent on $X$ if and only if the three series

$\sum \mu_n(X_n - A_n)$, $\quad \sum \left[ \int_{A_n} f_n^2 \, dX_n - \left( \int_{A_n} |f_n| \, dX_n \right)^2 \right]$ and $\sum \int_{A_n} |f_n| \, dX_n$

are convergent. Since $\int_{A_n} f_n^2 \, dX_n \le K \int_{A_n} |f_n| \, dX_n$, the result thus proved reduces to:

THEOREM III. The series $\sum f_n(x)$ is absolutely convergent almost everywhere on $X$ if and only if the two series

$\sum \mu_n(X_n - A_n)$, $\quad \sum \int_{A_n} |f_n| \, dX_n$

are convergent for a fixed value of $K > 0$, in which case they are convergent for every $K > 0$.

The simple sufficient conditions for absolute convergence in (11) are not equivalent for any two distinct values of $p$, cf. [34].

(11) If for any fixed $p$, $0 < p \le 1$, the series $\sum M_p(|f_n|)$ is convergent, then $\sum f_n$ is absolutely convergent almost everywhere on $X$.

In fact, if $\sum M_p(|f_n|)$ is convergent, $0 < p \le 1$, then so are the two series of Theorem III, so that $\sum f_n$ is absolutely convergent almost everywhere on $X$.
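
That the two series of Theorem III converge can be seen from the following bounds. This is a sketch, assuming $M_p(|f_n|) = \int_{X_n} |f_n|^p \, dX_n$ with $0 < p \le 1$ and $A_n = [\,|f_n| \le K\,]$.

```latex
% Sketch: the two series of Theorem III bounded by the p-th absolute moment, 0 < p <= 1.
\begin{align*}
  \mu_n(X_n - A_n) &\le K^{-p} \int_{X_n - A_n} |f_n|^p \, dX_n \le K^{-p} M_p(|f_n|), \\
  \int_{A_n} |f_n| \, dX_n &= \int_{A_n} |f_n|^{p} \, |f_n|^{1-p} \, dX_n
      \le K^{1-p} M_p(|f_n|) \qquad (\text{since } 1 - p \ge 0).
\end{align*}
```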


9. The condition $|f_n| < K$ in (6) insures that the convergence of $\sum f_n$ almost everywhere on $X$ implies the convergence of $\sum \bar{M}_2(f_n)$. In this section a different condition will be discussed which leads to the convergence of a certain moment series, if a series $\sum f_n$ of symmetrically distributed functions $f_n$ is convergent almost everywhere on $X$. In the special case $q = 1$, $p = 2$ of (12), the restriction to symmetrically distributed functions is not needed, as shown by Marcinkiewicz and Zygmund, [23], §3. The resulting theorem is given here for completeness as (13). The moment condition of (12) has the form of an inverted Hölder inequality; thus $c \le 1$ in (12) by Hölder's inequality.

(12) Suppose that $0 < q < p$ and $c > 0$. Let the functions $f_n$ on $X_n$ be symmetrically distributed and let $c M_p(|f_n|)^{q/p} \le M_q(|f_n|)$. Then the convergence almost everywhere on $X$ of $\sum f_n$ implies that $\sum M_p(|f_n|)^{r/p}$ is convergent, where $r = \operatorname{Max}(q, 2)$.


First it will be shown that $m_n \to 0$, where $m_n = M_p(|f_n|)^{1/p}$. If $B_n$, $C_n$ are the $x_n$-sets $B_n = [\,|f_n| > a m_n\,]$, $C_n = [\,|f_n| > m_n/a\,]$, where $a > 0$, then

$m_n^p \ge \int_{C_n} |f_n|^p \, dX_n \ge (m_n/a)^p \mu_n(C_n)$,

hence $\mu_n(C_n) \le a^p$. And, from Hölder's inequality

$\int_{C_n} |f_n|^q \, dX_n \le \mu_n(C_n)^{1 - q/p} \left( \int_{C_n} |f_n|^p \, dX_n \right)^{q/p} \le a^{p-q} m_n^q$.

Since obviously $\int_{X_n - B_n} |f_n|^q \, dX_n \le a^q m_n^q$, it follows that

$\mu_n(B_n - C_n) \ge a^q m_n^{-q} \int_{B_n - C_n} |f_n|^q \, dX_n \ge a^q (c - a^q - a^{p-q})$.

Since $\mu_n(B_n) \ge \mu_n(B_n - C_n)$, one sees that $\mu_n(B_n) > \text{const.} > 0$ if $a$ is selected sufficiently small. Now if $\limsup m_n > 0$ and if in Theorem I, one takes $K < a \limsup m_n$, then the series (*) of Theorem I is not convergent, since infinitely many of its terms are $> \text{const.} > 0$. Since this is in contradiction with the assumption that $\sum f_n$ is convergent almost everywhere on $X$, it follows that $m_n \to 0$.

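Let $A_n$ be the set $A_n = [\,|f_n| \le 1\,]$. Then

$\mu_n(X_n - A_n) \le \int_{X_n - A_n} |f_n|^p \, dX_n \le m_n^p$,

so that Hölder's inequality implies

$\left( \int_{X_n - A_n} |f_n|^q \, dX_n \right)^p \le \mu_n(X_n - A_n)^{p-q} \left( \int_{X_n - A_n} |f_n|^p \, dX_n \right)^q \le m_n^{p^2}$.

Hence the assumption $M_q(|f_n|) \ge c m_n^q$ implies that

$\int_{A_n} |f_n|^q \, dX_n \ge c m_n^q - m_n^p$,

where the right side is positive for sufficiently large n, since $m_n \to 0$ and $q < p$. For such n one now obtains from Hölder's inequality if $r = 2 \ge q$ and from the definition of $A_n$ if $r = q \ge 2$:

$\int_{A_n} f_n^2 \, dX_n \ge \left( \int_{A_n} |f_n|^q \, dX_n \right)^{r/q} \ge (c m_n^q - m_n^p)^{r/q} = (c - m_n^{p-q})^{r/q} m_n^r$.

This implies the statement of (12) that $\sum m_n^r$ is convergent. For $m_n \to 0$, $c > 0$, $p > q > 0$, and the series (***) of Theorem I reduces to the form $\sum \int_{A_n} f_n^2 \, dX_n$, since $f_n$ is symmetrically distributed.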
An 


In case $q = 1$, $p = 2$ the condition in (12) that $f_n$ is symmetrically distributed may be replaced by the much weaker condition that $M_1(f_n) = 0$. In fact, if $M_1(f_n) = 0$ and $c M_2(f_n)^{1/2} \le M_1(|f_n|)$ for some $c > 0$, then the symmetrically distributed functions $f^*_n$ of §4 satisfy $d M_2(f^*_n)^{1/2} \le M_1(|f^*_n|)$ for a $d > 0$ which depends only on $c$, as shown in [23], p. 71. Taking in account that the condition $c M_2(f_n)^{1/2} \le M_1(|f_n|)$ is homogeneous in $f_n$, so that it is possible to normalize $f_n$ by placing $M_2(f_n) = 1$, one obtains one half of the following theorem of Marcinkiewicz and Zygmund, the other half of which is clear from (4).

(13) If $f_n$ satisfies the conditions $M_1(f_n) = 0$, $M_2(f_n) = 1$ and $M_1(|f_n|) \ge c > 0$ for every n, then the series $\sum c_n f_n$ with constant coefficients $c_n$ is convergent almost everywhere on $X$ or divergent almost everywhere on $X$, according as $\sum c_n^2$ is convergent or divergent, cf. [23], §3.




10. In this §10 applications are given of a method of Zygmund ([38] §2). They will be useful to establish the relations of Part II and Part IV.

(14) Let the n-th partial sum of the series $\sum c_n r_n$ on $Z = \prod Z_n$ be denoted by $t_n$, where the $c_n$ are constants and the $r_n$ are defined in I (9). If a subsequence of the sequence $t_n$ is convergent almost everywhere on $Z$ with reference to the measure $\nu = \prod \nu_n$, then $\sum c_n^2$ is convergent. Thus $\sum c_n r_n$ is almost everywhere convergent on $Z$.


In fact, if a subsequence of $\{t_n\}$ is convergent almost everywhere on $Z$, then there exists a constant $M$, and a subset $C$ of $Z$ of positive measure, such that $|t_m(z)| < M$ holds on $C$ for arbitrarily large values of $m$. Thus,

$M^2 \nu C \ge \int_C t_m^2 \, dZ = \nu C \sum_{n=1}^{m} c_n^2 + 2 \sum{}' c_n c_l \int_C r_n r_l \, dZ$,

if $\sum'$ denotes summation over the range $1 \le n < l \le m$. Now the functions $r_n r_l$, $n < l$, are orthogonal to each other and orthogonal to a constant function on $Z$. Thus, the Parseval inequality, when applied to the characteristic function of $C$, shows that

$\sum{}' \left( \int_C r_n r_l \, dZ \right)^2 \le \nu C - (\nu C)^2 \le \tfrac{1}{4}$,

so that, by Schwarz' inequality,

$\left( 2 \sum{}' c_n c_l \int_C r_n r_l \, dZ \right)^2 \le 4 \sum{}' c_n^2 c_l^2 \sum{}' \left( \int_C r_n r_l \, dZ \right)^2 \le \tfrac{1}{2} \left( \sum_{n=1}^{m} c_n^2 \right)^2$,

and finally

$M^2 \nu C \ge (\nu C - 2^{-1/2}) \sum_{n=1}^{m} c_n^2$.

Since $M$ and $C$ may be selected in such a way that $\nu C$ is arbitrarily near to 1, this completes the proof that $\sum c_n^2$ is convergent. And now (4) implies that $\sum c_n r_n$ is almost everywhere convergent on $Z$.
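
The constants in this argument can be traced as follows. This is a sketch under the notation of the proof: $\sum'$ runs over $1 \le n < l \le m$, and the only inputs are the bound $\tfrac{1}{4}$ for the characteristic function of $C$ and Schwarz' inequality.

```latex
% Sketch: where the constant 2^{-1/2} comes from.
\begin{align*}
  \Bigl( \sum_{n=1}^{m} c_n^2 \Bigr)^2 &= \sum_{n=1}^{m} c_n^4 + 2 \sum{}' c_n^2 c_l^2
     \;\ge\; 2 \sum{}' c_n^2 c_l^2, \\
  \Bigl| 2 \sum{}' c_n c_l \int_C r_n r_l \, dZ \Bigr|
     &\le 2 \Bigl( \sum{}' c_n^2 c_l^2 \Bigr)^{1/2}
          \Bigl( \sum{}' \Bigl( \int_C r_n r_l \, dZ \Bigr)^2 \Bigr)^{1/2}
      \le 2 \cdot 2^{-1/2} \cdot \tfrac{1}{2} \sum_{n=1}^{m} c_n^2, \\
  M^2 \, \nu C \;\ge\; \int_C t_m^2 \, dZ
     &\ge \bigl( \nu C - 2^{-1/2} \bigr) \sum_{n=1}^{m} c_n^2 .
\end{align*}
```

The bound is useful only when $\nu C > 2^{-1/2}$, which is why $C$ is taken with measure near 1.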


(15) Let $s_n$ be the n-th partial sum of $\sum f_n$ on $X = \prod X_n$. If the subsequence $\{s_{n_k}\}$ of $\{s_n\}$ is convergent almost everywhere on $X$, then there exist constants $a_n$ such that $\sum (f_n - a_n)$ converges almost everywhere on $X$.

Consider first the case where the distribution of $f_n$ on $X_n$ is symmetric, i.e., where the sets defined by $f_n > \omega$ and $f_n < -\omega$ have equal measures for every real $\omega$. Then the function $r_n f_n$ on $X_n \times Z_n$ and $f_n$ on $X_n$ are equimeasurable, i.e., the sets defined by $f_n > \omega$ on $X_n$ and $f_n r_n > \omega$ on $X_n \times Z_n$ have equal measure for every real $\omega$. Thus if $t_n$ denotes the n-th partial sum of the series $\sum f_n r_n$ on $X \times Z = \prod (X_n \times Z_n)$, then the sequence $\{t_{n_k}\}$ is convergent almost everywhere on $X \times Z$. Thus, by Fubini's theorem, $\{t_{n_k}\}$ is convergent almost everywhere on $Z$ at almost every fixed point of $X$. Hence, by (14), $\sum f_n r_n$ is convergent almost everywhere on $Z$ at almost every fixed point of $X$. In view of Fubini's theorem, this implies that $\sum f_n r_n$ is convergent almost everywhere on $X$ at almost every fixed point $z$ of $Z$. Since the functions $f_n$ on $X_n$ and $f_n r_n$ on $X_n \times Z_n$ are equimeasurable for every fixed $z_n$ in $Z_n$, this implies finally that $\sum f_n$ is almost everywhere convergent on $X$, so that one may choose $a_n = 0$ in the case under consideration.

Now let $f_n$ be arbitrary and let $f^*_n$ be the function introduced in §4. Clearly, $\sum f^*_n(x, y)$ has the same convergence property on $X \times Y$ as $\sum f_n$ has on $X$. Moreover $f^*_n(x_n, y_n)$ has a symmetric distribution on $X_n \times Y_n$. Thus, by the case of (15) which has already been proved, $\sum f^*_n(x, y)$ is almost everywhere convergent on $X \times Y$. This implies that $\sum f^*_n(x, y)$ is almost everywhere convergent on $X$ at at least one fixed point $y_0$ of $Y$. On placing $g_n(y_0) = a_n$, one obtains the statement of (15). This proof was obtained from a proof in [24] by a slight simplification.


THEOREM IV. If $\sum f_n$ is convergent in measure on $X = \prod X_n$, then $\sum f_n$ is almost everywhere convergent on $X$.

In fact, if $\sum f_n$ is convergent in measure on $X$, then a classical theorem states that a subsequence of the sequence of partial sums of $\sum f_n$ is convergent almost everywhere on $X$. Thus, by (15), there exist constants $a_n$ for which $\sum (f_n - a_n)$ is convergent almost everywhere on $X$. Since $\sum f_n$ is convergent in measure on $X$, one may choose the constants $a_n$ equal to 0. Thus the proof of Theorem IV is complete.


11. The method of Zygmund which led in §10 to (14) and (15) may be used to prove certain additional theorems, examples of which are given here. Let $\gamma_{mn}$ be real numbers such that $\gamma_{mn} \to 1$ as $m \to \infty$ for every n and let $S_m$ denote the sum $S_m = \sum_{n=1}^{\infty} \gamma_{mn} f_n$, if it is convergent.

(16) If a sequence of constants $b_m$ exists such that the sequence $\{S_m - b_m\}$ is bounded on a set $C$ of positive measure, then there exist constants $a_n$ such that $\sum (f_n - a_n)$ is convergent almost everywhere on $X$, cf. [38], p. 97 and p. 100.


First, one may suppose that $\gamma_{mn} = 0$ for every fixed $m$ and $n > N = N_m$. This is obvious, since the series defining $S_m$ are, by Egoroff's theorem, uniformly convergent on a subset of $C$ which has positive measure.

Next, using the argument in the second part of the proof of (15), one sees that it is sufficient to consider the case in which the distribution of each $f_n$ on $X_n$ is symmetric, choosing $b_m = 0$ for every $m$ and $a_n = 0$ for every n. And the first part of the proof of (15) may be used to reduce the proof of (16) to the case where each $f_n$ is of the form $c_n r_n$, cf. (14) and I (9).

Furthermore, on omitting a finite number of terms of the series $\sum c_n r_n$, one may suppose that the measure of the set on which the sequence $\{S_m\}$ corresponding to $c_n r_n$ is bounded, is larger than $2^{-1/2}$. Finally, application of the method used in the proof of (14) establishes an upper bound for the sequence of numbers

$\sum_{n=1}^{N_m} \gamma_{mn}^2 c_n^2$.

Since $\gamma_{mn} \to 1$ as $m \to \infty$ for every n, this implies the convergence of $\sum c_n^2$, hence the convergence almost everywhere of $\sum c_n r_n$. This completes the proof of (16).

The generality of the summation method in (16) allows a large number of applications. However the next statement may not be obtained from (16), since the set $C^m$ is allowed to depend on $m$. The notation $S_m$ introduced at the beginning of this §11 is used again.

(17) For every $\epsilon > 0$, let there exist a constant $M = M_\epsilon$ and sets $C^m$ on $X$, such that $\mu C^m > 1 - \epsilon$ and that $|S_m| < M$ holds on $C^m$ for every $m$. Then there exists a sequence of constants $a_n$, such that $\sum (f_n - a_n)$ is convergent almost everywhere on $X$.

The proof of (17) is not essentially distinct from the proof of (16). The measure of the set on which the method of (14) is applied may be selected larger than $2^{-1/2}$ by choosing $\epsilon$ sufficiently small. The case where $S_m$ is selected to be the partial sum $s_m$ of $\sum f_n$ will be used in §19.

A comparison of (16) and (17) naturally suggests the truth of the following statement, of which no proof is known:

If $0 < a < 1$, $M > 0$ and if there exists a sequence of constants $b_m$ and a sequence of sets $C^m$ in $X$ such that $|S_m - b_m| < M$ on $C^m$ and $\mu C^m > a$ for every $m$, then there exists a sequence of constants $a_n$ such that $\sum (f_n - a_n)$ is convergent almost everywhere on $X$.

In the particular case where each $S_m$ is a partial sum of the series $\sum f_n$, a proof may be obtained by comparing Theorem V of Part II and Theorem J of [15]. An analysis of the proof of this last theorem might lead to a proof of the above conjecture.




PART III. Series of Independent Functions. 


Convergence criteria of series of independent functions on [0,1] in the sense of Kolmogoroff or, equivalently, in the sense of Steinhaus (cf. [20] and [11]), may be derived immediately from the convergence criteria of Part II for $\sum f_n$ on $X = \prod X_n$, where each $f_n$ is a function on $X_n$. It is immaterial whether the independent functions are defined on the interval [0,1] or on any set $Z$ carrying a measure $\nu$ such that $\nu Z = 1$.

12. Let $\{g_n(z)\}$, $\{g'_n(z')\}$ be two sequences of real valued measurable functions defined on spaces $Z$, $Z'$ carrying abstract Lebesgue measures $\nu$, $\nu'$ respectively. These sequences are said to be equimeasurable if for any finite number of Borel sets $\Omega_1, \dots, \Omega_k$ of real numbers, one has $\nu C = \nu' C'$, where $C$, $C'$ are the sets defined by

$C$: $g_n(z) \subset \Omega_n$, $\quad C'$: $g'_n(z') \subset \Omega_n$, $\quad (n = 1, \dots, k)$.

The functions $g_n(z)$ on $Z$ are said to be independent on $Z$ if for any finite number of Borel sets $\Omega_1, \dots, \Omega_k$ one has $\nu C = \prod_{n=1}^{k} \nu C_n$. Here $C_n$ and $C$ are defined by the inequalities:

$C_n$: $g_n(z) \subset \Omega_n$; $\quad C$: $g_n(z) \subset \Omega_n$ for $n = 1, \dots, k$.

The proofs of the following statements are evident, cf. e.g., [24].

If $\{g_n\}$ and $\{g'_n\}$ are equimeasurable sequences, then the functions obtained from $\{g_n\}$ and $\{g'_n\}$ by any limiting process (reducible to convergence in measure) are equimeasurable functions.

If $\{f_n\}$ is a sequence of functions obtained on $X = \prod X_n$ by the convention I (7), then $\{f_n\}$ is a sequence of independent functions on $X$.

If the sequences $\{f_n\}$, $\{g_n\}$ are independent on $X$, $Z$ respectively, then these sequences are equimeasurable, if and only if the functions $f_n$ and $g_n$ are equimeasurable for every n.

13. Now let $\{g_n(z)\}$ be a sequence of independent functions on a space $Z$, which carries a measure $\nu$ for which $\nu Z = 1$. For every n, let $X_n$ be the space of real numbers. Let $\mu_n$ be defined on $X_n$ by the definition $\mu_n(\Omega_n) = \nu(C_n)$, where $\Omega_n$ is any Borel set in $X_n$ and $C_n$ is the $z$-set $C_n = [g_n(z) \subset \Omega_n]$. Let a function $f_n(x_n)$ be defined on $X_n$ by placing $f_n(x_n) = x_n$ for every (real number) $x_n$ in $X_n$. From these definitions it is clear that $g_n$ and $f_n$ are equimeasurable functions. Now let the measure $\mu = \prod \mu_n$ be introduced on the product space $X = \prod X_n$. Then the statements of §12 imply that the sequences $\{f_n\}$ on $X$ and $\{g_n\}$ on $Z$ are equimeasurable sequences of functions. Thus one obtains the following theorem:

(1) If $\{g_n\}$ is a given sequence of $\nu$-measurable independent functions on a space $Z$ for which $\nu Z = 1$, then there exists a sequence $\{f_n\}$ of $\mu$-measurable functions, defined on a product space $X = \prod X_n$ by means of the convention I (7), such that $\{g_n\}$ and $\{f_n\}$ are equimeasurable sequences.

Thus, according to §12, the criteria of Part II for different types of convergence of series of the type of $\sum f_n$, retain their validity if $\sum f_n$ is replaced by $\sum g_n$, where the $g_n$ form a sequence of independent functions on a space $Z$ of total $\nu$-measure 1.

As another application, note that as a consequence of (1), Theorem J, §2 of [23] is a very special case of (16.1) on p. 280 in [9]. The more special character of the former has as one of its consequences that the former does, while the latter does not, allow a direct generalization to the case $p = 1$, cf. [24].


14. This §14 contains the negative answer to a question of Kac and Steinhaus concerning the relation between completeness and independence, cf. [30], §6.

If the functions $g_n$ on $Z$ are bounded, then clearly the sequence of powers $f_n^m$, $m = 0, 1, 2, \dots$, is complete on $X_n$, so that the set of all monomials in the $f_n$ is complete on $X = \prod X_n$. It cannot be concluded that the set of all monomials in the $g_n$ is complete on $Z$, or even that if this set is not complete, then the
set of independent functions gn on Z can be enlarged by a non-constant func- 
tion. In fact, if g(x) is defined on [0,1] by g(x) = 2424 or 1 — 22(1 —2)}, 
according as or $< then the sequence g°, g', 9’, 
is not complete on [0,1], and if $f$ is a function on [0,1] such that $f$ and $g$ are independent, then $f$ is a constant almost everywhere. If $f$ is not constant and $f$ and $g$ are independent, let $A$ be a set defined by $A$: $f(x) < \omega$, where $\omega$ is selected in such a way that the measure of $A$ is neither 0 nor 1. A contradiction is now easily obtained by applying the fundamental theorem of the calculus to the characteristic function of $A$.

Closely related is the remark that a sequence of independent functions $g_n$ (or even of their powers $(g_n)^m$) on [0,1] is never complete. In fact, if $f_n$ is the function on $X_n$ corresponding to $g_n$, and $k \ne l$, then $f_k f_l$ cannot be approximated by linear combinations of the $(f_n)^m$ unless either $f_k$ or $f_l$ is constant.


PART IV. 


In this part certain convergence criteria for infinite convolutions will be
derived from the corresponding criteria in Part II. As soon as the connection
between the theory of infinite convolutions and the theory of the series in
Part II has been established, one may transcribe these theorems without
further proof (although theorems of Part II and Part IV are never equivalent).
This explains the list of theorems without proofs in § 20 and § 21.


15. A distribution function $\sigma = \sigma(t)$, $-\infty < t < +\infty$, is defined to
be a monotone function such that $\sigma(-\infty) = 0$, $\sigma(+\infty) = 1$. Two distribution
functions will be considered as identical if they are equal at their
common continuity points. Thus two distribution functions are identical if
they are equal at a dense set of values of $t$. A sequence $\{\sigma_n\}$ of distribution
functions is said to converge if there exists a distribution function $\sigma$, such
that $\sigma_n(t) \to \sigma(t)$ holds at every continuity point of $\sigma$. Note that $\{\sigma_n\}$ is
not considered to be convergent if $\sigma_n(t) \to \alpha(t)$ holds for every $t$ but $\alpha(t)$
is not a distribution function. If $\sigma_n$, $\sigma$ are distribution functions, then $\sigma_n \to \sigma$
obviously holds if $\sigma_n(t) \to \sigma(t)$ for a dense set of values of $t$. The following
criterion for the convergence of a sequence $\{\sigma_n\}$ of distribution functions is
not quite obvious.

(1) The sequence $\{\sigma_n\}$ is not convergent if and only if there exists an $a > 0$
and for every $n$, an $m > n$ such that $|\sigma_m(t') - \sigma_n(t'')| > a$ holds for every $t'$
and $t''$ in at least one interval of length $a$.


That the existence of such an $a$ is not compatible with the convergence
of $\{\sigma_n\}$ is obvious. Thus, it remains to prove that if such an $a$ does not exist,
then $\{\sigma_n\}$ is convergent.

By a theorem of Helly, a subsequence of $\{\sigma_n\}$ may be selected which tends
everywhere to a monotone function $\alpha(t)$. If $t_0$ is a continuity point of $\alpha$, and
$\epsilon > 0$ is arbitrary, let $\delta$ be chosen such that $|\alpha(t_0 + 2\delta) - \alpha(t_0)| < \epsilon$.
By assumption, there exists an $n = n_\delta$ such that for every $m > n = n_\delta$, the
inequality $|\sigma_m(t') - \sigma_n(t'')| \le \delta$ holds for at least one set of values $t'$, $t''$ on
every interval of length $\delta$.

Let the element $\sigma_m$, $m > n = n_\delta$, of the subsequence which determined $\alpha$
be such that $|\alpha(t_0 + 2\delta) - \sigma_m(t_0 + 2\delta)| < \epsilon$, so that $|\sigma_m(t_0 + 2\delta) - \alpha(t_0)| < 2\epsilon$.
Hence, the definition of $n = n_\delta$ implies first $|\sigma_n(t_0 + \delta) - \alpha(t_0)| < 3\epsilon$,
and then $|\sigma_p(t_0) - \alpha(t_0)| < 4\epsilon$ for every $p > n = n_\delta$. Thus, $\sigma_n(t_0) \to \alpha(t_0)$
at every continuity point of $\alpha(t)$. Finally, $\alpha$ must be a distribution function.
In fact, since $\alpha(+\infty)$ and $\alpha(-\infty)$ exist, the condition which was used
above to determine $\delta$ as a function of $\epsilon$ and $t_0$ may be satisfied by the same
$\delta > 0$ for a fixed $\epsilon > 0$ and every $t_0$ which is sufficiently large. Since
$\sigma_p(+\infty) = 1$ and $\sigma_p(-\infty) = 0$ for every $p$, this implies that $\alpha(+\infty) = 1$
and $\alpha(-\infty) = 0$; so that $\alpha$ is a distribution function, and the proof of (1)
is complete.

A distribution function $\sigma = \sigma(t)$ determines uniquely a Lebesgue-Stieltjes
measure on $-\infty < t < +\infty$. This measure, which will also be
denoted by $\sigma$, may be obtained by a well-known extension, from the definition
$\sigma A = \sigma(b) - \sigma(a)$, where $A$ is the $t$-set $a \le t < b$ and $a$, $b$ are continuity
points of $\sigma$. Thus $\sigma t$ denotes the $\sigma$-measure of the point $t$, i.e., the jump of
$\sigma(t)$ at $t$. The integral of an integrable function $g(t)$ with respect to this
measure $\sigma$ is the Lebesgue-Stieltjes integral

$$\int_{-\infty}^{+\infty} g(t)\, d\sigma(t).$$


The spectrum of $\sigma$ is defined to be the set of those values of $t$ for which
$\sigma(t + \epsilon) - \sigma(t - \epsilon) > 0$ holds for every $\epsilon > 0$. The point spectrum of $\sigma$
is the set of those $t$ for which $\sigma t > 0$, i.e., the set of discontinuity points of $\sigma$.


16. If $\sigma_1$ and $\sigma_2$ are two distribution functions, then the convolution
$\sigma_1 * \sigma_2$ of $\sigma_1$ and $\sigma_2$ is defined by

$$\sigma_1 * \sigma_2(t) = \int_{-\infty}^{+\infty} \sigma_1(t - s)\, d\sigma_2(s) = \iint_{r+s<t} d\sigma_1(r)\, d\sigma_2(s),$$

where the double Lebesgue-Stieltjes integral on the right represents the
measure of the $(r,s)$-set $[r + s < t]$, if in the $(r,s)$-plane the product of
the two measures $\sigma_1$ and $\sigma_2$ is used. From this second expression for $\sigma_1 * \sigma_2$
the following statement is clear.


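(2) The function $\sigma_1 * \sigma_2$ is a distribution function and the point spectrum
(spectrum) of $\sigma_1 * \sigma_2$ may be obtained by adding arbitrary elements of the
point spectra (spectra) of $\sigma_1$ and $\sigma_2$ (and forming the closure). Moreover,

$$\sigma_1 * \sigma_2(t + 0) = \int_{-\infty}^{+\infty} \sigma_1(t - s + 0)\, d\sigma_2(s) \quad\text{and}\quad (\sigma_1 * \sigma_2)t = \sum_{r+s=t} (\sigma_1 r)(\sigma_2 s).$$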
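For purely discontinuous distribution functions, statement (2) can be checked by direct computation. The following is a minimal sketch, not part of the original argument: it assumes each distribution function is represented by a dictionary from discontinuity points to jumps, and the name `convolve` is a hypothetical helper introduced only for illustration.

```python
from collections import defaultdict
from fractions import Fraction


def convolve(jumps1, jumps2):
    """Jumps of sigma1 * sigma2 for two purely discontinuous distributions.

    The jump at t is the sum of (sigma1 r)(sigma2 s) over all r + s = t,
    so the point spectrum is the set of sums r + s.
    """
    out = defaultdict(Fraction)  # Fraction() is 0, so masses accumulate exactly
    for r, mass_r in jumps1.items():
        for s, mass_s in jumps2.items():
            out[r + s] += mass_r * mass_s
    return dict(out)


# Assumed example data: a fair coin on {0, 1} and a uniform law on {0, 1, 2}.
sigma1 = {0: Fraction(1, 2), 1: Fraction(1, 2)}
sigma2 = {0: Fraction(1, 3), 1: Fraction(1, 3), 2: Fraction(1, 3)}
conv = convolve(sigma1, sigma2)
print(sorted(conv.items()))
```

The keys of `conv` form the point spectrum of the convolution, and swapping the two arguments gives the same dictionary, in line with the commutative law.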

Using Fubini's theorem, one sees that the convolution of any finite number
of distribution functions satisfies the commutative and associative laws. In
what follows $\sigma$, $\tau$, $\sigma_n$, $\tau_n$ always denote distribution functions.


(3) If $\sigma_n \to \sigma$ and $\tau_n \to \tau$ as $n \to \infty$, then $\sigma_n * \tau_n \to \sigma * \tau$.


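It is sufficient to show that $\sigma_n * \tau_n(t_0) - \sigma * \tau_n(t_0) \to 0$, as $n \to \infty$, at
every continuity point $t_0$ of $\sigma * \tau$. In fact, one obtains the statement
$\sigma * \tau_n(t_0) - \sigma * \tau(t_0) \to 0$ on replacing $\sigma$, $\sigma_n$, $\tau_n$ by $\tau$, $\tau_n$, $\sigma$, and the two
together imply (3).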

If $\epsilon > 0$ is given, let $r_1, \ldots, r_p$ be a finite number of discontinuity points
of $\sigma(t)$ which are such that the sum of the jumps at all remaining
discontinuity points of $\sigma(t)$ is less than $\epsilon$. Since, by (2), the $p$ numbers $t_0 - r_k$
$(k = 1, \ldots, p)$ are continuity points of $\tau(t)$, one may determine the
non-overlapping intervals $I_k$: $a_k < t < b_k$, in such a way that $\sum[\tau(b_k) - \tau(a_k)] < \epsilon$,
where $a_k < t_0 - r_k < b_k$, and the $a_k$, $b_k$ are continuity points of $\tau(t)$.
Next, one can determine $M$ in such a way that $\sum[\tau_n(b_k) - \tau_n(a_k)] < 2\epsilon$
for $n > M$. On the other hand, one may determine $N > M$ in such a way
that $|\sigma_n(t) - \sigma(t)| < 2\epsilon$ holds for every $n > N$ and for every $t$ which is not
in any of the intervals $t_0 - b_k < t < t_0 - a_k$. Thus, on separating the
contributions of the intervals $I_k$ and of the rest of the integration domain, one has

$$|\sigma_n * \tau_n(t_0) - \sigma * \tau_n(t_0)| = \Big|\int_{-\infty}^{+\infty} [\sigma_n(t_0 - s) - \sigma(t_0 - s)]\, d\tau_n(s)\Big| \le 2\epsilon + 2\epsilon,$$

if $n > N$ $(> M)$; so that the proof of (3) is complete.

Let $\omega$ denote the distribution function for which $\omega(t) = 0$ or 1 according
as $t < 0$ or $t > 0$.


(4) If $\sigma_n$, $\tau_n$ are such that $\sigma_n \to \sigma$ and $\sigma_n * \tau_n \to \sigma$ as $n \to \infty$, then $\tau_n \to \omega$.


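If $\epsilon > 0$ is given and $t$, $-t$ are continuity points of $\sigma$ for which
$\sigma(t) - \sigma(-t) > 1 - \epsilon$, then $N$ may be chosen such that $\sigma_n(t) - \sigma_n(-t) > 1 - 2\epsilon$
and $\sigma_n * \tau_n(t) - \sigma_n * \tau_n(-t) > 1 - 2\epsilon$ for $n > N$. Thus, for $n > N$,

$$1 - 2\epsilon < \int_{-\infty}^{+\infty} [\sigma_n(t - s) - \sigma_n(-t - s)]\, d\tau_n(s) = \int_{-\infty}^{-2t} + \int_{2t}^{+\infty} + \int_{-2t}^{2t} \le 2\epsilon + \tau_n(2t) - \tau_n(-2t),$$

where the variation of $\tau_n$ in the first two integrals and the integrand of the
last integral have been majorized by 1. Since $\tau_n(2t) - \tau_n(-2t) > 1 - 4\epsilon$
for any $\epsilon$ and a suitable $t = t_\epsilon$, one can find a subsequence of the sequence
$\{\tau_n\}$ which tends to a distribution function $\tau$. From (3) and the first
assumption of (4) one sees that the corresponding subsequence of $\sigma_n * \tau_n$ tends
to $\sigma * \tau$; so that $\sigma * \tau = \sigma$. Now if $\tau_n \to \omega$ did not hold, one could select $\tau$ to
be distinct from $\omega$, which is impossible by (5). Thus $\tau_n \to \omega$, as stated in (4).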
(5) If $\sigma * \tau = \sigma$ then $\tau = \omega$.


Let $t_0 > 0$ be given; the maximum $m$ of $\sigma(t + t_0 + 0) - \sigma(t - 0)$ for
$-\infty < t < +\infty$ exists and is positive, and is attained at every point of a
bounded closed $t$-set, $A_0$. Let $t_1$, $t_2$ be the least and largest values of $t$ in $A_0$.
Then (2) implies that

$$\int_{-\infty}^{+\infty} [\sigma(t_1 + t_0 - s + 0) - \sigma(t_1 - s - 0)]\, d\tau(s) = m.$$

Since the total variation of $\tau$ is 1, and since $0 \le \sigma(t + t_0 + 0) - \sigma(t - 0) \le m$,
this is possible only if $\tau(s)$ has the variation 0 on any interval on which
$\sigma(t_1 + t_0 - s + 0) - \sigma(t_1 - s - 0) < m$ holds everywhere. In particular,
the total variation of $\tau$ must be 0 on any interval consisting only of positive
$t$-values. Using $t_2$ instead of $t_1$, one sees that $\tau$ also has the total variation 0
on any interval consisting of negative $t$-values. Thus $\tau = \omega$, as stated in (5).

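17. If $\{\sigma_n\}$ is a given sequence of distribution functions, let $\sigma_{.n}$ be the
convolution

$$\sigma_{.n} = \sigma_1 * \sigma_2 * \cdots * \sigma_n$$

and let the infinite convolution of the $\sigma_n$ be defined formally as
$\ast\sigma_n = \sigma_1 * \sigma_2 * \cdots$. The infinite convolution $\ast\sigma_n$ is said to be
convergent if there exists a distribution function $\sigma$ such that $\sigma_{.n} \to \sigma$, in
which case one writes $\sigma = \ast\sigma_n$. On denoting by $\sigma_{n,m}$, where $n < m$, the
distribution functions

$$\sigma_{n,m} = \sigma_{n+1} * \sigma_{n+2} * \cdots * \sigma_m$$

and supposing that $\sigma = \ast\sigma_n$ is convergent, one sees that not only $\sigma_{.n} \to \sigma$ as
$n \to \infty$ but also $\sigma_{.n} * \sigma_{n,m} = \sigma_{.m} \to \sigma$, no matter how $m > n$ depends on $n$. Thus,
by (4), $\sigma_{n,m} \to \omega$ as $n \to \infty$, for arbitrary $m = m_n$. This proves one-half of
the following Cauchy criterion for convergence of $\ast\sigma_n$: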


(6) The infinite convolution $\ast\sigma_n$ is convergent if and only if $\sigma_{n,m} \to \omega$ as
$n \to \infty$ no matter how $m > n$ depends on $n$; cf. [10], Theorem 1.
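The Cauchy criterion (6) can be watched at work on a small example. The sketch below is illustrative only and rests on assumptions not in the text: every factor is taken to be a symmetric two-point law with jumps 1/2 at $\pm a_k$, and the helper names (`convolve`, `bernoulli`, `tail_convolution`, `mass_within`) are hypothetical. For $a_k = 2^{-k}$ the block $\sigma_{n,m}$ carries all of its mass within $2^{-n}$ of the origin, so it tends to the unit jump at 0; for $a_k = 1$ it does not.

```python
from fractions import Fraction
from functools import reduce


def convolve(jumps1, jumps2):
    """Jumps of the convolution of two purely discontinuous distributions."""
    out = {}
    for r, mass_r in jumps1.items():
        for s, mass_s in jumps2.items():
            out[r + s] = out.get(r + s, 0) + mass_r * mass_s
    return out


def bernoulli(a):
    """Symmetric two-point law: jumps 1/2 at -a and at +a."""
    return {-a: Fraction(1, 2), a: Fraction(1, 2)}


def tail_convolution(n, m, step):
    """sigma_{n,m} = sigma_{n+1} * ... * sigma_m for the factors bernoulli(step(k))."""
    return reduce(convolve, [bernoulli(step(k)) for k in range(n + 1, m + 1)])


def mass_within(jumps, eps):
    """Mass carried by the closed interval [-eps, eps]."""
    return sum(mass for t, mass in jumps.items() if abs(t) <= eps)


def geometric(k):
    return Fraction(1, 2 ** k)


def constant(k):
    return Fraction(1)


near_zero = mass_within(tail_convolution(5, 10, geometric), Fraction(1, 32))
spread_out = mass_within(tail_convolution(5, 9, constant), Fraction(1, 2))
print(near_zero, spread_out)
```

In the geometric case the whole mass of $\sigma_{5,10}$ lies in $[-1/32, 1/32]$, while in the constant case only the central atom of $\sigma_{5,9}$ lies within $1/2$ of the origin.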


In order to prove the second half of (6), let it be assumed that the
sequence $\sigma_{.n}$ is not convergent, so that, by (1), there exists an $a > 0$, such that,
for every $n$, there is an $m > n$ for which $|\sigma_{.m}(t') - \sigma_{.n}(t'')| > a$ holds for every
$t'$, $t''$ in at least one $t$-interval of length $3a$. By the assumption that $\sigma_{n,m} \to \omega$,
one may select this $m$ and $n$ in such a way that $\sigma_{n,m}(+a) - \sigma_{n,m}(-a) > 1 - a$.
Now, if $t' - a$, $t'$, $t' + a$ are continuity points of $\sigma_{.n}$ and $\sigma_{.m}$ in
the above mentioned interval of length $3a$, one has

$$\sigma_{.m}(t') = \int_{-\infty}^{+\infty} \sigma_{.n}(t' - s)\, d\sigma_{n,m}(s) = \int_{-\infty}^{-a} + \int_{-a}^{+a} + \int_{+a}^{+\infty},$$

where

$$(1 - a)\,\sigma_{.n}(t' - a) \le \int_{-a}^{+a} \le \sigma_{.n}(t' + a) \quad\text{and}\quad 0 \le \int_{-\infty}^{-a} + \int_{+a}^{+\infty} \le a,$$

so that $(1 - a)\sigma_{.n}(t' - a) \le \sigma_{.m}(t') \le \sigma_{.n}(t' + a) + a$. A contradiction is
now obtained, since there must exist a $t''$ between $t' - a$ and $t' + a$ for which
$|\sigma_{.m}(t') - \sigma_{.n}(t'')| \le a$, while $|\sigma_{.m}(t') - \sigma_{.n}(t'')| > a$ for any such $t''$.

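If $\sigma = \ast\sigma_n$ is convergent, it follows from (6) that, on placing
$\sigma_{n.} = \sigma_{n+1} * \sigma_{n+2} * \cdots$, so that $\sigma = \sigma_{.n} * \sigma_{n.}$, one has $\sigma_{n.} \to \omega$, as $n \to \infty$.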


(6 bis) If $\ast\sigma_n$ is convergent, then the (topological) limit of the spectrum
of $\sigma_{.n}$ exists and is the spectrum of $\ast\sigma_n$; cf. [10], Theorem 8.


This means that if for every $\epsilon > 0$, the interval $I_\epsilon$: $t_0 - \epsilon < t < t_0 + \epsilon$
contains points of the spectrum of $\sigma_{.n}$ for infinitely many $n$, then, for every $\epsilon$,
the interval $I_\epsilon$ contains points of the spectrum of $\sigma_{.n}$ for all but a finite number
of $n$ and $t_0$ is in the spectrum of $\sigma$. It is sufficient to prove that $t_0$ is in the
spectrum of $\sigma$, since the other part of the statement follows then from $\sigma_{.n} \to \sigma$.
Thus it is sufficient to prove that, for every $\epsilon > 0$, one has
$\sigma(t_0 + 2\epsilon) - \sigma(t_0 - 2\epsilon) > 0$.

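Let $n$ be chosen such that $I_\epsilon$ contains a point of the spectrum of $\sigma_{.n}$,
so that $\sigma_{.n}(t_0 + \epsilon) - \sigma_{.n}(t_0 - \epsilon) > 0$, and let $n$ be so large that
$\sigma_{n.}(\epsilon) - \sigma_{n.}(-\epsilon) > 0$. The proof is now evident, since one obtains easily
from $\sigma = \sigma_{.n} * \sigma_{n.}$ that

$$\sigma(t_0 + 2\epsilon) - \sigma(t_0 - 2\epsilon) \ge [\sigma_{.n}(t_0 + \epsilon) - \sigma_{.n}(t_0 - \epsilon)]\,[\sigma_{n.}(+\epsilon) - \sigma_{n.}(-\epsilon)] > 0.$$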


Thus, by means of (2) and (6 bis), the spectrum of $\ast\sigma_n$ may be
determined from the spectra of the $\sigma_n$.


18. Let $f(x)$ be a $\mu$-measurable function on a space $X$, which carries a
measure $\mu$ such that $\mu X = 1$. Then the distribution function $\sigma(t)$ of $f(x)$
is defined by $\sigma(t) = \mu A_t$, where $A_t$ is the $x$-set $[f(x) < t]$. Clearly the
discontinuity points of $\sigma(t)$ are the values of $t$ at which $\mu B_t > 0$, where $B_t$ is
defined by $f(x) = t$. Moreover, $\sigma(t - 0) = \mu A_t$ and $\sigma(t + 0) = \mu A_t + \mu B_t$.

On the other hand, if a distribution function $\sigma(t)$ is given, then a
function $f(x)$ may be defined on a suitable space $X$ such that $\sigma(t)$ is the
distribution function of $f(x)$. In fact, let $X$ be the open interval $0 < x < 1$
and let $\mu$ be the Lebesgue measure on this interval. Then the "inverse
function" of $\sigma(t)$, which may be defined on $0 < x < 1$, has $\sigma$ as its distribution
function.
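The "inverse function" construction can be made concrete for a purely discontinuous $\sigma$. The following is a minimal sketch under stated assumptions: $\sigma$ is given by a dictionary of jumps, the names `inverse_function` and `measure_below` are hypothetical, and the Lebesgue measure of a subset of $0 < x < 1$ is evaluated on a grid of midpoints, which is exact here only because the cumulative jumps are multiples of the grid spacing.

```python
from bisect import bisect_left
from fractions import Fraction
from itertools import accumulate


def inverse_function(jumps):
    """Return f on 0 < x < 1 taking the value t_k for c_{k-1} < x <= c_k,
    where c_k is the total of the jumps at the first k discontinuity points."""
    points = sorted(jumps)
    cumulative = list(accumulate(jumps[p] for p in points))

    def f(x):
        # index of the first cumulative jump that is >= x
        return points[bisect_left(cumulative, x)]

    return f


def measure_below(f, t, grid_size):
    """Lebesgue measure of the x-set [f(x) < t], evaluated on grid midpoints."""
    midpoints = [Fraction(2 * i + 1, 2 * grid_size) for i in range(grid_size)]
    hits = sum(1 for x in midpoints if f(x) < t)
    return Fraction(hits, grid_size)


# Assumed example: jumps 1/4, 1/2, 1/4 at the points 0, 1, 3.
jumps = {0: Fraction(1, 4), 1: Fraction(1, 2), 3: Fraction(1, 4)}
f = inverse_function(jumps)
print([measure_below(f, t, 12) for t in (0, 1, 2, 4)])
```

The measure of $[f(x) < t]$ reproduces the total of the jumps lying below $t$, which is the sense in which $f$ has $\sigma$ as its distribution function.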

If $f$ is $\mu$-measurable on $X$, and $\sigma(t)$ is its distribution function, one can
express certain $\mu$-integrals on $X$ in terms of corresponding $\sigma$-integrals and
conversely, as follows:


(7) One has

$$\int_X g(f(x))\, dX = \int_{-\infty}^{+\infty} g(t)\, d\sigma(t),$$

if at least one of the integrals exists.
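On a finite space the identity (7) reduces to regrouping a finite sum, which the following sketch verifies exactly. The space, the functions `f` and `g`, and the helper `distribution` are assumptions chosen for illustration and do not come from the text.

```python
from fractions import Fraction


def distribution(space, f):
    """Jumps of the distribution function of f on a finite measure space."""
    jumps = {}
    for x, mass in space.items():
        jumps[f(x)] = jumps.get(f(x), 0) + mass
    return jumps


# Assumed example: six equally weighted points, f(x) = x mod 3, g(t) = t^2.
space = {x: Fraction(1, 6) for x in range(6)}


def f(x):
    return x % 3


def g(t):
    return t * t


sigma = distribution(space, f)
integral_over_space = sum(mass * g(f(x)) for x, mass in space.items())
integral_over_sigma = sum(g(t) * mass for t, mass in sigma.items())
print(integral_over_space, integral_over_sigma)
```

Both sums run over the same products, once indexed by the points of the space and once by the values of $f$.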




19. Suppose that $f_n(x_n)$ is $\mu_n$-measurable on the space $X_n$, where
$\mu_n X_n = 1$, and that the functions $f_n$ are considered, according to I (7), as
$\mu$-measurable functions $f_n(x)$ on the product space $X = \prod X_n$, where $\mu = \prod \mu_n$.
Let $\sigma_n(t)$ be the distribution function of $f_n(x)$.

The distribution function of $f_1 + f_2$ on $X$ is $\sigma_1 * \sigma_2(t)$. In fact, if $A_t$
is the $X$-set $[f_1(x) + f_2(x) < t]$, one has by I (8) and by the definitions of
$\sigma_1(t)$, $\sigma_2(t)$ as distribution functions of $f_1$, $f_2$:

$$\mu A_t = \iint_{r+s<t} d\sigma_1(r)\, d\sigma_2(s) = \sigma_1 * \sigma_2(t - 0).$$

By an easy induction argument one obtains the proof of the following
statement (if use is made of notations introduced in § 17):


(8) The distribution function of $f_1 + f_2 + \cdots + f_n$ on $X$ is $\sigma_{.n}$ and the
distribution function of $f_{n+1} + \cdots + f_m$ on $X$ is $\sigma_{n,m}$.
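Statement (8) can be verified by brute force on a finite product space. The sketch below is an illustration under assumptions: the three factor spaces, their measures and the functions on them are invented example data, and the helpers `convolve`, `distribution` and `distribution_of_sum` are hypothetical names. It enumerates the product space directly and compares the result with the iterated convolution.

```python
from fractions import Fraction
from functools import reduce
from itertools import product as cartesian


def convolve(jumps1, jumps2):
    """Jumps of the convolution of two purely discontinuous distributions."""
    out = {}
    for r, mass_r in jumps1.items():
        for s, mass_s in jumps2.items():
            out[r + s] = out.get(r + s, 0) + mass_r * mass_s
    return out


def distribution(space, f):
    """Jumps of the distribution function of f on one factor space."""
    jumps = {}
    for x, mass in space.items():
        jumps[f[x]] = jumps.get(f[x], 0) + mass
    return jumps


def distribution_of_sum(spaces, functions):
    """Jumps of f_1 + ... + f_n, computed point by point on the product space."""
    jumps = {}
    for combo in cartesian(*(space.items() for space in spaces)):
        mass, value = Fraction(1), 0
        for (x, point_mass), f in zip(combo, functions):
            mass *= point_mass  # product measure of the point (x_1, ..., x_n)
            value += f[x]
        jumps[value] = jumps.get(value, 0) + mass
    return jumps


# Assumed example data: three two-point spaces with unequal weights.
spaces = [
    {"a": Fraction(1, 2), "b": Fraction(1, 2)},
    {"u": Fraction(1, 3), "v": Fraction(2, 3)},
    {"p": Fraction(1, 4), "q": Fraction(3, 4)},
]
functions = [{"a": 0, "b": 1}, {"u": -1, "v": 2}, {"p": 0, "q": 5}]

direct = distribution_of_sum(spaces, functions)
via_convolution = reduce(
    convolve, [distribution(s, f) for s, f in zip(spaces, functions)]
)
print(direct == via_convolution)
```

The comparison relies only on the product measure assigning to each point the product of the factor weights, which is the finite counterpart of the independence used in the text.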


The connection between Part II and the theory of infinite convolutions 
may be formulated as follows: 


THEOREM V. The series $\sum f_n$ is almost everywhere convergent on
$X = \prod X_n$ if and only if the infinite convolution $\ast\sigma_n$ of the distribution
functions $\sigma_n$ of the $f_n$ is convergent, in which case $\sigma = \ast\sigma_n$ is the distribution
function of $\sum f_n$; cf. [10], Theorem 32.


In fact, by (6), $\ast\sigma_n$ is convergent if and only if $\sigma_{n,m} \to \omega$ as $n \to \infty$,
for arbitrary $m = m_n > n$. Now it is clear from (8) and from the definition
of the distribution function of a function, that the condition $\sigma_{n,m} \to \omega$ is
satisfied if and only if the sequence of functions $g_n(x) = f_{n+1}(x) + \cdots + f_m(x)$
tends in measure to 0 on $X$, as $n \to \infty$, for arbitrary $m = m_n > n$. This is
the case if and only if the series $\sum f_n$ is convergent in measure on $X$. Finally,
by Theorem IV, $\sum f_n$ is convergent in measure on $X$ if and only if $\sum f_n$ is
convergent almost everywhere on $X$. The last statement of Theorem V is evident,
since the definitions clearly imply that if $s_n$ tends on $X$ in measure to $s$, then
$\tau_n \to \tau$, where $\tau_n$, $\tau$ are the distribution functions of $s_n$, $s$ respectively.

Another proof of Theorem V, which now follows, does not make use of
the considerations of § 16 and § 17. In fact, it may be used to give a second proof
of the statement (6) of § 17. One half of Theorem V is obvious. In fact, if $\sum f_n$
is convergent almost everywhere on $X$, then $\sum f_n$ is convergent in measure on
$X$, so that $\{\sigma_{.n}\}$ is a convergent sequence of distribution functions and
$\sigma = \lim \sigma_{.n} = \ast\sigma_n$ is the distribution function of $\sum f_n$. Now, suppose that $\ast\sigma_n$
is convergent, so that $\{\sigma_{.n}\}$ is a convergent sequence of distribution functions.
If $\epsilon > 0$ is given, let $M$ be so large that $\sigma_{.n}(M) - \sigma_{.n}(-M) > 1 - \epsilon$
for every $n$. In the space $X = \prod X_n$, let $C^n$ be the set $C^n = [|s_n(x)| \le M]$.
Since, by (8), $\sigma_{.n}$ is the distribution function of $s_n = f_1 + \cdots + f_n$, one
has $\mu C^n = \sigma_{.n}(M) - \sigma_{.n}(-M) > 1 - \epsilon$. On selecting $s_m$ in II (17) to be
the partial sum $s_m$ of $\sum f_n$, one sees that $\sum (f_n - a_n)$ is convergent almost
everywhere on $X$ for a suitable choice of the $a_n$. Thus, by the half of Theorem V
which has already been proved, $\ast\sigma_n(t - a_n)$ is convergent. Since also $\ast\sigma_n$
is convergent, the series $\sum a_n$ is convergent. And finally, $\sum f_n$ is convergent
almost everywhere on $X$. This completes the second proof of Theorem V.


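20. It is clear from (7) that if the $k$-th moment $N_k(\sigma_n)$ of $\sigma_n$ is
defined by

$$N_k(\sigma_n) = \int_{-\infty}^{+\infty} t^k\, d\sigma_n(t),$$

then $M_k(f_n) = N_k(\sigma_n)$, if at least one of these moments exists. Moreover, if

$$\overline{N}_2(\sigma_n) = \int_{-\infty}^{+\infty} \{t - N_1(\sigma_n)\}^2\, d\sigma_n(t) = N_2(\sigma_n) - N_1(\sigma_n)^2,$$

then $\overline{M}_2(f_n) = \overline{N}_2(\sigma_n)$. Similarly, if $A_n$ denotes the $x_n$-set $[|f_n(x_n)| \le K]$,
then

$$\mu_n(X_n - A_n) = \sigma_n(-K - 0) + 1 - \sigma_n(K + 0);$$

and

$$\int_{A_n} g(f_n)\, dX_n = \int_{-K}^{K} g(t)\, d\sigma_n(t)$$

if at least one of the integrals exists.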

As a consequence of Theorem V and the above remarks, one can write the
convergence criteria of Part II for $\sum f_n$ as convergence criteria for $\ast\sigma_n$.
Accordingly, in the following list of theorems, the proofs are represented by
references to statements in Part II or to preceding theorems in the same list.


THEOREM VI (Three series theorem). The infinite convolution $\ast\sigma_n$ is
convergent if and only if, for a fixed $K > 0$, the following three series are
convergent:


K K K 

-K J-K 

in which case the same holds for every K >0. [Cf. Theorem I.] 
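A small computation may help to fix ideas about the three series. The sketch below is not taken from the text: it assumes the classical Kolmogoroff form of the three series (mass outside $[-K, K]$, first moment over $[-K, K]$, and second moment about the mean over $[-K, K]$), it assumes symmetric two-point factors with jumps 1/2 at $\pm a_n$, and the helper names are hypothetical.

```python
from fractions import Fraction


def truncated_terms(jumps, K):
    """Terms contributed by one factor to the three series (classical form, assumed)."""
    tail = sum(mass for t, mass in jumps.items() if abs(t) > K)
    mean = sum(t * mass for t, mass in jumps.items() if abs(t) <= K)
    second = sum(t * t * mass for t, mass in jumps.items() if abs(t) <= K)
    return tail, mean, second - mean * mean


def partial_sums(step, K, count):
    """Partial sums of the three series for the factors with jumps 1/2 at +-step(n)."""
    totals = [Fraction(0), Fraction(0), Fraction(0)]
    for n in range(1, count + 1):
        a = step(n)
        terms = truncated_terms({-a: Fraction(1, 2), a: Fraction(1, 2)}, K)
        totals = [total + term for total, term in zip(totals, terms)]
    return totals


def harmonic_step(n):
    return Fraction(1, n)


def unit_step(n):
    return Fraction(1)


shrinking = partial_sums(harmonic_step, 1, 100)
fixed = partial_sums(unit_step, 1, 100)
print(float(shrinking[2]), fixed[2])
```

With $a_n = 1/n$ the third partial sum stays below 2, consistent with convergence, whereas with $a_n = 1$ it grows like the number of terms.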


The shortest proof of Theorem VI may apparently be given by noting
first that the convergence of $\ast\sigma_n$ is equivalent with the convergence in
measure of $\sum f_n$ (using (8) and (6)), then applying II (2) (which obviously
retains its validity in the case of convergence in measure), thus reducing the
proof of Theorem VI to the proof of (10), which may be obtained from the
theory of Fourier-Stieltjes transforms; cf. [10], § 4 and Theorem 34.

Proofs of (9)-(11) may be obtained either from Theorem VI or from 
the corresponding statements in Part II and also from the theory of Fourier- 
Stieltjes transforms of distribution functions. 

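(9) If $\overline{N}_2(\sigma_n)$ exists for every $n$ and if $\sum \overline{N}_2(\sigma_n)$ is convergent, then $\sigma = \ast\sigma_n$
exists if and only if $\sum N_1(\sigma_n)$ is convergent, in which case $N_1(\sigma) = \sum N_1(\sigma_n)$
and $\overline{N}_2(\sigma) = \sum \overline{N}_2(\sigma_n)$. [Cf. II (4).]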
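The additivity of the first moment and of the second moment about the mean under convolution, which underlies (9), can be checked exactly for finitely many purely discontinuous factors. The sketch below uses assumed example data and hypothetical helper names (`convolve`, `moment`, `second_moment_about_mean`).

```python
from fractions import Fraction


def convolve(jumps1, jumps2):
    """Jumps of the convolution of two purely discontinuous distributions."""
    out = {}
    for r, mass_r in jumps1.items():
        for s, mass_s in jumps2.items():
            out[r + s] = out.get(r + s, 0) + mass_r * mass_s
    return out


def moment(jumps, k):
    """k-th moment: the sum of t^k times the jump at t."""
    return sum(t ** k * mass for t, mass in jumps.items())


def second_moment_about_mean(jumps):
    """Second moment minus the square of the first moment."""
    return moment(jumps, 2) - moment(jumps, 1) ** 2


# Assumed example data.
sigma1 = {0: Fraction(1, 2), 1: Fraction(1, 2)}
sigma2 = {0: Fraction(1, 3), 1: Fraction(1, 3), 2: Fraction(1, 3)}
conv = convolve(sigma1, sigma2)

print(moment(conv, 1), second_moment_about_mean(conv))
```

For this pair the first moments 1/2 and 1 add up to 3/2, and the second moments about the mean 1/4 and 2/3 add up to 11/12.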

(10) If $\sigma_n(-K) = 0$, $\sigma_n(K) = 1$ for every $n$, then $\sigma = \ast\sigma_n$ exists if and
only if both series $\sum N_1(\sigma_n)$ and $\sum \overline{N}_2(\sigma_n)$ are convergent. [Cf. II (6) and
II (4).]

(11) The infinite convolution $\ast\sigma_n$ is convergent if so are the series


+00 


tdon(t) and | |? don(t) 
-00 -00 


for a fixed $p$, $1 < p \le 2$. [Cf. II (7).]
The necessary condition for the convergence of $\sum f_n$ in § 9 appears now
in the following form:
(12) Ifq<p,¢>0 and if for every n, on(t) = 1—on(—t) and 
+00 +00 
f | |¢don(t) Se f | |? don(t), 
©) 


then the convergence of %on implies that 
+00 
3( | (r= Max 2) 
-00 


is convergent. [Cf. II (12).] 


21. The infinite convolution $\ast\sigma_n$ is said to be absolutely convergent if
it is convergent and remains convergent on arbitrary permutation of the $\sigma_n$.
In view of Theorem V, this is the case if and only if $\sum f_n$ is almost everywhere
convergent no matter in what fixed order the terms are taken. Thus one
obtains Theorem VII (which may also be obtained as an immediate consequence
of Theorem II), and (13).


THEOREM VII. The infinite convolution $\ast\sigma_n$ is absolutely convergent
if and only if, for a fixed $K > 0$, the three series


7K K 


are convergent, in which case the same holds for every $K > 0$, and the infinite
convolution is independent of the order of the terms. [Cf.
Theorem II.]


(13) The infinite convolution $\ast\sigma_n$ is absolutely convergent if the two series


+00 +00 
tdon(1)| and 
-00 -00 


are convergent for a fixed $p$, $1 < p \le 2$. [Cf. Theorem VII or II (7) and
Theorem II.]

From the remark following Theorem II, one obtains the following 
statement : 

(14) If $\ast\sigma_n$ is convergent, then there exists a sequence $\{a_n\}$ of constants
such that $\ast\sigma_n(t - a_n)$ is absolutely convergent.

In the theory of infinite convolutions it is hard to distinguish between
the two types of convergence discussed in § 7 and § 8. Thus the criteria
corresponding to those of § 8 are listed here as criteria for absolute convergence
of $\ast\sigma_n$. For (16), cf. [34].

(15) The infinite convolution $\ast\sigma_n$ is absolutely convergent if for a fixed
$K > 0$ the two series

$$\sum \big(1 + \sigma_n(-K) - \sigma_n(K)\big) \quad\text{and}\quad \sum \int_{-K}^{K} |t|\, d\sigma_n(t)$$

are convergent, in which case the same holds for every $K > 0$. [Cf. Theorem
III.]


(16) The infinite convolution $\ast\sigma_n$ is absolutely convergent if the series

$$\sum \int_{-\infty}^{+\infty} |t|^p\, d\sigma_n(t)$$

is convergent for a fixed $p$, $0 < p \le 1$. [Cf. II (11).]


22. Let $\mathfrak{A}$ represent a class of Borel sets on the infinite $t$-axis which
class is invariant under translations of the $t$-axis, and which includes, along
with any sequence of sets $A_n$, the set $\sum A_n$.

Such classes are, for instance, the class $\mathfrak{A}'$ of all enumerable sets; the
class $\mathfrak{A}''$ of all Borel sets which are 0-sets in the ordinary sense; the class of
all 0-sets according to any Hausdorff measure.

A distribution function $\sigma(t)$ will be called pure if it has, with reference
to every class $\mathfrak{A}$, the following property: If the $\sigma$-measure of one set in an $\mathfrak{A}$
is not 0, then the $\sigma$-measure of some set in this $\mathfrak{A}$ is 1. If, for instance, $\sigma$ is
pure and has a discontinuity point, then the $\sigma$-measure of a set in $\mathfrak{A}'$ is not 0,
so that, according to the definition, the $\sigma$-measure of some enumerable set is 1.
Such a distribution function will be called purely discontinuous. It is easy
to prove that $\sigma$ is pure if the $\sigma$-measure of every set in $\mathfrak{A}''$ is 0, i.e., if $\sigma$ is
absolutely continuous. On the other hand, although a continuous, but not
absolutely continuous, pure distribution function $\sigma$ is always singular, a
singular distribution function need not be pure. These notions allow the
formulation of the following theorem.


THEOREM VIII (Pure Theorem). If the $\sigma_n$ are purely discontinuous and
$\sigma = \ast\sigma_n$ converges, then $\sigma$ is a pure distribution function.


Let $f_n$ on $X_n$ be, for every $n$, a function which has $\sigma_n$ as distribution
function. Since $\ast\sigma_n$ is convergent, the series $\sum f_n$ is convergent almost everywhere
on $X = \prod X_n$. And since each $\sigma_n$ is purely discontinuous, the function $f_n$ may
be assumed to attain an at most enumerable set of distinct values on $X_n$. Let
$M$ denote the (enumerable) modul of values of $t$ generated by the values taken
by all $f_n$, and if $A$ is any $t$-set, let $M(+)A$ denote the set formed by the sums
of any element in $M$ and any element in $A$. It is clear that if $A$ belongs to a
class $\mathfrak{A}$ then so does $M(+)A$.

Now suppose that the set $A$ of the class $\mathfrak{A}$ is such that the $\sigma$-measure $\sigma A$
of $A$ is positive, i.e. that $f = \sum f_n$ takes values in $A$ on a set $D$ of positive
measure in $X = \prod X_n$. If $C$ denotes the subset of $X$ where $f$ takes values
in $A' = M(+)A$, then clearly $C$ satisfies the requirements of I (6), so that
the measure of $C$ is either 0 or 1. Since $C \supset D$ and the measure of $D$ is
positive, this implies that the measure of $C$ is 1. Finally, the measure of $C$
in $X$ is equal to the $\sigma$-measure of the $t$-set $A' = M(+)A$, so that $\sigma A' = 1$.
This completes the proof that $\sigma = \ast\sigma_n$ is a pure distribution function.

In view of the remarks at the beginning of this Section, Theorem VIII
immediately implies the following statement:


(17) If $\sigma = \ast\sigma_n$ is a convergent infinite convolution of purely discontinuous
distribution functions $\sigma_n$, then $\sigma$ is either purely discontinuous or singular or
absolutely continuous; cf. [10], Theorem 35.


23. It seems to be very difficult to decide in general which of the cases
of Theorem VIII take place for a given convergent infinite convolution $\ast\sigma_n$
of purely discontinuous distribution functions $\sigma_n$. In other words, no general
rule is known to decide for a given class $\mathfrak{A}$ whether or not the $\sigma$-measure of
each $t$-set in $\mathfrak{A}$ is 0 or not. In this section a proof will be given of a theorem
of P. Lévy, which gives the decision in case of the class $\mathfrak{A}'$ of § 22. Note that,
according to § 15, the expression $\sigma t$ denotes, for a distribution function $\sigma$ and
a real number $t$, the value of the $\sigma$-measure of $t$, i.e. the jump of $\sigma$ at $t$.
The following lemma is related to the proof of Theorem VIII in [21]
and to [15], Lemma 3.
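(18) If $0 < d \le 1$, $0 < 6\epsilon < d$, $l > 0$ and if the distribution functions
$\lambda$, $\mu$, $\nu$ have the properties

$$\lambda = \mu * \nu, \quad \lambda(a + 2l) - \lambda(a - 2l) < d + \epsilon, \quad \nu(l) - \nu(-l) > 1 - \epsilon,$$

while $\lambda a = d$ holds for some value $a$ of $t$, then there exist real numbers $b$ and
$c = a - b$, such that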


(i) d—«< pb < (ii) |c| <1; (iii) we 


If $p$, $q$ are the discontinuities of $\mu$ and $\nu$, then one obtains from the
definition of a convolution, since $\lambda = \mu * \nu$:

$$(*) \qquad d = \lambda a = \sum_{p+q=a} (\mu p)(\nu q).$$

In (*) the terms for which $|a - p| < l$ (or $|q| < l$) satisfy $\sum \nu q \le 1$, and
the remaining terms satisfy $\sum \nu q < \epsilon$, $\sum \mu p \le 1$. Hence

$$d < \epsilon + \max_{|a-p|<l} \mu p.$$


If this maximum is reached at b, one obtains (ii) and the left half of (i). 


Next, one has 
2l) —A(a—2l) = — —t—2l) ]dv(t) 
= [(a+1) 
which implies the other half of (i) and in addition, using the left half of (i), 
a—p| the term p= 6, 


Separating now in (*) the terms where 
q=c, from the other terms, one finds 


= 
‘= 


which clearly implies the remaining inequality (iii) of (18). 
In the proof of Theorem IX, use will be made of the following statement
(19), which is an immediate consequence of (18):

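(19) If $\lambda = \mu_n * \nu_n$ for every $n$ and $\mu_n \to \lambda$, so that $\nu_n \to \omega$ by (4), and if
for some $a$, one has $\lambda a = d > 0$, then there exist real numbers $b_n$, $c_n$ such that
$b_n + c_n = a$, $b_n \to a$, $\mu_n b_n \to d$ and $c_n \to 0$, $\nu_n c_n \to 1$.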


The characterization of the purely discontinuous case of Theorem VIII
or (17) may be formulated as follows (cf. [21], Theorem XIII; in the proof
below use has been made of a letter from B. Jessen; cf. [3 bis], footnote '*).


THEOREM IX. If $\sigma = \ast\sigma_n$ is the convergent infinite convolution of the
purely discontinuous distribution functions $\sigma_n$, then $\sigma$ is purely discontinuous
if and only if

$$(**) \qquad \prod d_n > 0, \quad\text{where } d_n = \max_t \sigma_n t.$$


The sufficiency of condition (**) for the existence of at least one
discontinuity of $\sigma$ is obvious, so that the corresponding statement of Theorem IX
follows from Theorem VIII. It remains to prove that if $\sigma$ has at least one
discontinuity point, then (**) is satisfied.

Let $\sigma_{.n}$, $\sigma_{n.}$ denote the same convolutions as in § 17. On applying (19)
to the equation $\sigma = \sigma_{.n} * \sigma_{n.}$ instead of to $\lambda = \mu_n * \nu_n$, one obtains a sequence
of numbers $c_n$, such that $c_n \to 0$, $\sigma_{n.} c_n \to 1$, as $n \to \infty$. Thus, replacing
$\sigma_n(t)$ by $\sigma_n(t - c_n + c_{n-1})$, one can suppose that $c_n = 0$ for every $n$, i.e., that
$\sigma_{n.} 0 \to 1$, which clearly implies $\sigma_n 0 \to 1$. On omitting a finite number of
terms, one may suppose that $\sigma_{n.} 0 > \tfrac{1}{2}$ for every $n$, and $\sigma 0 > \tfrac{1}{2}$. Then:


o20=d=o00>}; 1; ,0400; o,0>51, 


and also 


onpSi—d; & op 
Thus, from (2), 


n n n n n n 
0.09 & = II (1 — d)(1— oa"), 
k=1 


k=1 k=2 l=k+1 OAp=-q k=2 


so that 
n n n 
Ilo,0 + (1— d) (1 — II < II o,0 + 1—d. 
k=1 k=2 k=1 


On letting $n$ tend to infinity, one obtains $\prod \sigma_k 0 \ge 2d - 1 > 0$, so that the
proof of Theorem IX is complete.
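The criterion (**) can be illustrated on partial convolutions. The sketch below is an illustration under assumptions, not part of the proof: the two families of two-point factors are invented examples, and the helper names are hypothetical. In the first family the largest jumps $d_n = 1 - 2^{-n}$ (for $n \ge 1$) have a positive product and the partial convolutions keep a jump of that size; in the second family $d_n = 1/2$, the product tends to 0, and the largest jump of the partial convolutions shrinks with it.

```python
from fractions import Fraction
from functools import reduce
from math import prod


def convolve(jumps1, jumps2):
    """Jumps of the convolution of two purely discontinuous distributions."""
    out = {}
    for r, mass_r in jumps1.items():
        for s, mass_s in jumps2.items():
            out[r + s] = out.get(r + s, 0) + mass_r * mass_s
    return out


def lopsided(n):
    """Jump 1 - 2^-n at 0 and jump 2^-n at the point 3^-n (assumed example)."""
    return {Fraction(0): 1 - Fraction(1, 2 ** n), Fraction(1, 3 ** n): Fraction(1, 2 ** n)}


def fair(n):
    """Jumps 1/2 at 0 and at 2^-n (assumed example)."""
    return {Fraction(0): Fraction(1, 2), Fraction(1, 2 ** n): Fraction(1, 2)}


def partial_convolution(factor, n):
    """sigma_{.n} = sigma_1 * ... * sigma_n."""
    return reduce(convolve, [factor(k) for k in range(1, n + 1)])


def product_of_max_jumps(factor, n):
    """d_1 d_2 ... d_n, where d_k is the largest jump of the k-th factor."""
    return prod(max(factor(k).values()) for k in range(1, n + 1))


n = 10
lopsided_jump = max(partial_convolution(lopsided, n).values())
fair_jump = max(partial_convolution(fair, n).values())
print(float(lopsided_jump), fair_jump)
```

For these two families the largest jump of $\sigma_{.n}$ coincides with the partial product $d_1 \cdots d_n$, because distinct choices of atoms give distinct sums; this coincidence is a feature of the examples and is not claimed in general.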


THE JOHNS HOPKINS UNIVERSITY. 


REFERENCES.


1. É. Borel, "Les probabilités dénombrables et leurs applications arithmétiques," Rendiconti del Circolo Matematico di Palermo, vol. 27 (1909), pp. 247-271.

2. P. J. Daniell, "Integrals in an infinite number of dimensions," Annals of Mathematics, vol. 20 (1919), pp. 281-288.

3. A. Denjoy, "Sur les variables pondérées multipliables de M. Cantelli," Comptes Rendus, vol. 196 (1933), pp. 1712-1714.

3 bis. P. Erdös and A. Wintner, "Additive arithmetical functions and statistical independence," American Journal of Mathematics, vol. 61 (1939), pp. 713-721.

4. P. Hartman, E. R. van Kampen and Aurel Wintner, "Asymptotic distributions and statistical independence," American Journal of Mathematics, vol. 61 (1939), pp. 477-486.

5. E. K. Haviland, "On the inversion formula for Fourier-Stieltjes transforms in more than one dimension," I and II, American Journal of Mathematics, vol. 57 (1935), pp. 94-100 and pp. 382-388.

6. E. K. Haviland, "A note on a property of Fourier-Stieltjes transforms in more than one dimension," American Journal of Mathematics, vol. 57 (1935), pp. 567-572.

7. E. K. Haviland, "On the momentum problem for distribution functions in more than one dimension," I and II, American Journal of Mathematics, vol. 57 (1935), pp. 562-568 and vol. 58 (1936), pp. 164-168.

8. E. Hopf, "On causality, statistics and probability," Journal of Mathematics and Physics, vol. 13 (1937), pp. 51-102.

9. B. Jessen, "The theory of integration in a space of an infinite number of dimensions," Acta Mathematica, vol. 63 (1934), pp. 249-323.

10. B. Jessen and A. Wintner, "Distribution functions and the Riemann zeta function," Transactions of the American Mathematical Society, vol. 38 (1935), pp. 48-88.

11. M. Kac, "Sur les fonctions indépendantes," I, Studia Mathematica, vol. 6 (1936), pp. 46-58.

12. M. Kac and H. Steinhaus, "Sur les fonctions indépendantes," II, Studia Mathematica, vol. 6 (1936), pp. 59-66.

13. S. Kaczmarz and H. Steinhaus, "Le système orthogonal de M. Rademacher," Studia Mathematica, vol. 2 (1930), pp. 231-247.

14. E. R. van Kampen and A. Wintner, "On bounded convolutions," Bulletin of the American Mathematical Society, vol. 43 (1937), pp. 564-566.

15. E. R. van Kampen and A. Wintner, "On divergent infinite convolutions," American Journal of Mathematics, vol. 59 (1937), pp. 635-654.

16. K. Knopp, "Mengentheoretische Behandlung einiger Probleme der diophantischen Approximationen und der transfiniten Wahrscheinlichkeiten," Mathematische Annalen, vol. 95 (1925), pp. 409-426.

17. A. Khintchine and A. Kolmogoroff, "Ueber Konvergenz von Reihen deren Glieder durch den Zufall bestimmt werden," Recueil de la Société Mathématique de Moscou, vol. 32 (1925), pp. 668-677.

18. A. Kolmogoroff, "Ueber die Summen durch den Zufall bestimmter zufälliger Grössen," Mathematische Annalen, vol. 99 (1928), pp. 309-319, vol. 102 (1930), pp. 484-488.

19. A. Kolmogoroff, "Allgemeine Masstheorie und Wahrscheinlichkeitsrechnung," Comptes Rendus de l'Académie Communiste (1929), pp. 8-21.

20. A. Kolmogoroff, "Grundbegriffe der Wahrscheinlichkeitsrechnung," Ergebnisse der Mathematik und ihre Grenzgebiete, vol. 2, no. 3 (1933).

21. P. Lévy, "Sur les séries dont les termes sont des variables éventuelles indépendantes," Studia Mathematica, vol. 3 (1931), pp. 119-155.

22. Z. Lomnicki and S. Ulam, "Sur la théorie de la mesure dans les espaces combinatoires et son application au calcul des probabilités, I, Variables indépendantes," Fundamenta Mathematica, vol. 23 (1939), pp. 237-278.

23. J. Marcinkiewicz and A. Zygmund, "Sur les fonctions indépendantes," Fundamenta Mathematica, vol. 29 (1937), pp. 60-90.

24. J. Marcinkiewicz and A. Zygmund, "Quelques théorèmes sur les fonctions indépendantes," Studia Mathematica, vol. 7 (1938), pp. 104-120.

25. R. E. A. C. Paley and A. Zygmund, "On some series of functions," Proceedings of the Cambridge Philosophical Society, vol. 26 (1930), pp. 337-357 and 458-474, vol. 28 (1932), pp. 190-205.

26. H. Rademacher, "Über die Verteilung gewisser konvergenzerzeugender Faktoren," Mathematische Zeitschrift, vol. 11 (1921), pp. 276-288 and "Einige Sätze über Reihen von allgemeinen Orthogonalfunktionen," Mathematische Annalen, vol. 87 (1922), pp. 112-138.

27. F. Riesz, "Untersuchungen über Systeme integrierbarer Funktionen," Mathematische Annalen, vol. 69 (1910), pp. 449-497.

28. H. Steinhaus, "Les probabilités dénombrables et leur rapport à la théorie de la mesure," Fundamenta Mathematica, vol. 4 (1923), pp. 286-320.

29. H. Steinhaus, "Sur la probabilité de la convergence de séries," Studia Mathematica, vol. 2 (1930), pp. 21-39.

30. H. Steinhaus, "La théorie et les applications des fonctions indépendantes au sens stochastique," Actualités Scientifiques et Industrielles, Paris (1938).

31. C. Visser, "The law of nought-or-one in the theory of probability," Studia Mathematica, vol. 7 (1938), pp. 143-149.

32. N. Wiener, "The homogeneous chaos," American Journal of Mathematics, vol. 60 (1938), pp. 897-936.

33. A. Wintner, "Gaussian distribution functions and convergent infinite convolutions," American Journal of Mathematics, vol. 57 (1935), pp. 821-826.

34. A. Wintner, "A note on the convergence of infinite convolutions," American Journal of Mathematics, vol. 57 (1935), p. 839.

35. A. Wintner, "On a class of Fourier transforms," American Journal of Mathematics, vol. 58 (1936), pp. 15-20.

36. A. Wintner, "On the densities of infinite convolutions," American Journal of Mathematics, vol. 59 (1937), pp. 376-378.

37. A. Wintner, "On the smoothness of infinite convolutions of the type occurring in the theory of the Riemann zeta-function," American Journal of Mathematics, vol. 61 (1939), pp. 231-236.

38. A. Zygmund, "On the convergence of lacunary trigonometric series," Fundamenta Mathematica, vol. 16 (1930), pp. 90-107.


K-CYCLIC ELEMENTS.* 
By J. W. T. Youngs.¹


The introduction, by G. T. Whyburn,² of the concept of cyclic element
into the study of continuous curves proved so fruitful that it is only natural
to hope for a further decomposition of a Peano space. In this connection
Whyburn himself has defined cyclic elements of higher order where the space
is a closed and bounded subset of Euclidean n-space.³ The approach is
combinatorial. Recently an attack has been made on the problem from a purely
point set theoretic standpoint by D. W. Hall.⁴ The approach taken in this
note is similar to that of Hall in that the space is a cyclicly connected Peano
space and the decomposition is point set theoretic.

To enlarge on this comment we shall survey, for a moment, the two 
equivalent notions of cyclic element from which the generalizations spring. 
In a Peano space a proper cyclic element My, consists of the totality of points 
which are conjugate to a point p which is not an end point or a cut point. 
A proper cyclic element M(a,b) may also be defined as the totality of points 
which are conjugate to each of two distinct points a and b which are conjugate 
to each other.° The generalization of Hall may be said to be from the first 
definition while that here presented is from the second. 

A large number of desirable properties are common to both concepts but 
it is also true that the generalizations do not seem to be all one might hope for. 
Sadly missing, for example, in any analogue to the theorem that the product 
of a cyclic element and a connected set is connected. On the other hand it is 


* Received August 18, 1939; Revised January 8, 1940. 

¹ The work on this paper was done while the author was in residence at
Charlottesville. He wishes to take this opportunity to express his appreciation
to Professor G. T. Whyburn and the Department of Mathematics at Virginia for
their helpful cooperation.

² G. T. Whyburn, “Cyclicly connected continuous curves,” Proceedings of the
National Academy of Sciences, vol. 13 (1927), pp. 31-38; W. L. Ayres, “On the structure
of a plane continuous curve,” Proceedings of the National Academy of Sciences, vol. 13
(1927), pp. 650-657; C. Kuratowski and G. T. Whyburn, “Sur les éléments cycliques
et leurs applications,” Fundamenta Mathematicae, vol. 16 (1930), pp. 305-331. The
reader is asked to consult this last paper for the terminology of the subject and for an
extensive bibliography.

³ G. T. Whyburn, “Cyclic elements of higher orders,” American Journal of
Mathematics, vol. 56 (1934), pp. 133-146.

⁴ The author has had the privilege of reading Hall's contribution in manuscript.

⁵ For a proof see Kuratowski and Whyburn, loc. cit., p. 311.

hoped that this note may serve as an indication of a general direction of 
approach which seems to be adequate for certain purposes. 

It will be noticed that the discussion is confined entirely to non-degenerate
elements and the theorems proved are analogous to those true for
non-degenerate cyclic elements in a Peano space. It follows that nothing is said
about a “hyper-space,” for the non-degenerate elements will not cover the
space. These remarks, together with others in this note, give rise to problems
which might be of some interest to the reader.


Bi-conjugacy. 


1.1. The space (which we shall denote by the symbol 1) is a cyclicly con- 
nected Peano space, or we may consider it as a non-degenerate cyclic element 
of a Peano space. In general, small letters will denote points of the space, 
while capitals will be reserved for sets of points. 


1.2. A point a is said to be bi-conjugate to a point b (notation: a 2~ b)
if for every pair of distinct points x_1 and x_2 different from a and b it is
true that in 1 − (x_1 + x_2), S_a = S_b.⁶ In general, a point a is said to be
k-conjugate to a point b (notation: a k~ b) if for every set of k distinct
points x_1, ···, x_k different from a and b it is true that in
1 − (x_1 + ··· + x_k), S_a = S_b.


1.3. We shall confine the discussion almost exclusively to bi-conjugacy ; never- 
theless, a large number of the statements will be equally true for k-conjugacy. 
Whenever a statement (with obvious modifications) is true for the more general 
concept we shall indicate this by prefacing the remark with an asterisk. In 
every instance the proofs for the general statements are obvious modifications 
of those offered for bi-conjugacy. 


1.4. The first three of the usual axioms for an equivalence are obviously
satisfied for bi-conjugacy. That is: (1) a 2~ b or a non 2~ b, (2) a 2~ a,
(3) a 2~ b implies b 2~ a. The transitivity property is absent but we do
have a modification of it.


*THEOREM. If a 2~ x_1 2~ x_2 2~ ··· 2~ x_n 2~ b, then any pair of points
separating a from b must contain a point of the set x_1, ···, x_n.⁷

Proof. Take any pair (p,q) distinct from a, x_1, ···, x_n, b. Now


⁶ If we are considering 1 − (x_1 + ··· + x_k), then by S_a we shall mean the
component of 1 − (x_1 + ··· + x_k) which contains a.

⁷ The author is indebted to W. L. Ayres for some valuable suggestions in connection
with this theorem.




a 2~ x_1, hence in 1 − (p + q) we have S_a = S_{x_1}. For the same reason,
S_{x_1} = S_{x_2}, ···, S_{x_n} = S_b. Hence S_a = S_b and the pair (p,q) does not
separate a from b.


(For k-conjugacy the theorem will read: If a k~ x_1 k~ ··· k~ x_n k~ b,
then any set of k distinct points separating a from b must contain a point of
the set x_1, ···, x_n).


*COROLLARY. If a 2~ x_1 2~ ··· 2~ x_l 2~ b, a 2~ y_1 2~ ··· 2~ y_m 2~ b,
and a 2~ z_1 2~ ··· 2~ z_n 2~ b, where the x's, y's, and z's constitute
a set of distinct points; then a 2~ b.⁸


Bi-cyclic elements. 


2.1. The totality of points bi-conjugate to each of three distinct points which
are bi-conjugate to each other constitutes a bi-cyclic element. That is, if a, b, c
are distinct points, and a 2~ b, b 2~ c, c 2~ a, then they generate a bi-cyclic
element M(a,b,c) and x ∈ M(a,b,c) if and only if x 2~ a, b and c. (The
totality of points k-conjugate to each of (k + 1) distinct points which are
k-conjugate to each other constitutes a k-cyclic element).


2.2. A bi-cyclic element need not be connected; in fact, it may consist of
exactly three points. Let the space be the points on the circumference of a
circle together with those on three chords which form an inscribed triangle.
If the vertices of the triangle are a, b and c, then M(a,b,c) = a + b + c.
However, we do have the following


*2.3. THEOREM. Any point of a bi-cyclic element is bi-conjugate to any
other point of it.


Proof. If x, y ∈ M(a,b,c) then x and y are both bi-conjugate to a, b and c
by 2.1, and so by 1.4 x 2~ y.
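
The example of 2.2 and the statement of 2.3 can be tried out on a finite model. The following is only a sketch under an assumption of our own: the continuum is replaced by a graph (the circle through a, b, c and the three chords, each arc and chord subdivided by two interior vertices, with hypothetical names such as `arc_ab_1`), and "removing a pair of points" becomes "deleting two vertices." The graph is a stand-in, not the space of the text, but it reproduces M(a,b,c) = a + b + c.

```python
from itertools import combinations

def build_graph():
    """Circle through a, b, c plus the three chords; every arc and chord is
    subdivided by two interior vertices (a finite stand-in for the continuum)."""
    adj = {}
    def add_path(nodes):
        for u, v in zip(nodes, nodes[1:]):
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    for u, v in (("a", "b"), ("b", "c"), ("c", "a")):
        add_path([u, f"arc_{u}{v}_1", f"arc_{u}{v}_2", v])      # circular arc
        add_path([u, f"chord_{u}{v}_1", f"chord_{u}{v}_2", v])  # chord
    return adj

def connected(adj, removed, start, goal):
    """Depth-first search from start to goal avoiding the removed vertices."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        if u == goal:
            return True
        for v in adj[u]:
            if v not in removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return False

def bi_conjugate(adj, x, y):
    """x 2~ y: no pair of other vertices separates x from y."""
    if x == y:
        return True
    others = [v for v in adj if v not in (x, y)]
    return all(connected(adj, {p, q}, x, y) for p, q in combinations(others, 2))

adj = build_graph()
generators = ("a", "b", "c")
# M(a,b,c): all vertices bi-conjugate to each of a, b, c (definition 2.1).
element = {x for x in adj if all(bi_conjugate(adj, x, g) for g in generators)}
print(sorted(element))
```

In this model every interior vertex of the arc or chord from a to b is cut off from c by the pair (a, b), which is why only the three vertices survive.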


This gives a degree of homogeneity indicated below. 


*2.4. THEOREM. If the distinct points x, y, z are elements of M(a,b,c),
then M(x,y,z) = M(a,b,c).


Proof. The notation M(x,y,z) is justified by 2.3. If p ∈ M(x,y,z),
p is bi-conjugate to x, y, z, each of which is bi-conjugate to a. Therefore
p 2~ a, by 1.4. Similarly p 2~ b, c. Therefore p ∈ M(a,b,c). Hence
M(x,y,z) ⊂ M(a,b,c). Similarly M(a,b,c) ⊂ M(x,y,z).


⁸ Throughout the paper a weaker form of this corollary will be used, the form in
which l = m = n = 1. In a paper read at the April meeting of the American
Mathematical Society in Chicago (1939), T. Radó called attention to the corresponding “weak
transitivity” for conjugacy, thus the key to the situation here follows his remarks.



COROLLARY. If M(a,b,c)·M(x,y,z) contains three distinct points
(u,v,w) then M(a,b,c) = M(x,y,z).


Both are identical to M(u,v,w).


*2.5. THEOREM. If M(a,b,c)·M(x,y,z) = p + q, then the pair (p,q)
cuts the space.

Proof. Suppose the theorem false. Since a 2~ p 2~ x, a 2~ q 2~ x,
and the pair (p,q) does not cut the space, a 2~ x by 1.4. Similarly,
a 2~ y, z. Therefore, a ∈ M(x,y,z). By the same argument b, c ∈ M(x,y,z)
and by 2.4 M(a,b,c) = M(x,y,z).


Sequences of bi-cyclic elements. 


3.1. In connection with the point set theoretic properties of M(a,b,c) we 
have seen that it may consist of exactly three points and so need not be con- 
nected. Another property is obtained from the following 


*THEOREM. If a_n → a, b_n → b and a_n 2~ b_n, (n = 1, 2, 3, ···), then
a 2~ b.


Proof. Consider any (p,q) distinct from a and b. Take two regions⁹
U(a), U(b) without common points,¹⁰ both excluding p and q. There is an
integer n such that a_n ∈ U(a), b_n ∈ U(b). Now in 1 − (p + q), a_n ∈ S_a,
b_n ∈ S_b and S_{a_n} = S_{b_n}. Therefore, S_a = S_b and a 2~ b.


*3.2. THEOREM. M(a,b,c) is closed.


Proof. Suppose p_n → p, p_n ∈ M(a,b,c), (n = 1, 2, 3, ···). We assert
that p 2~ a. The sequence a, a, a, ··· converges to a, p_n → p and p_n 2~ a,
(n = 1, 2, 3, ···). By 3.1, p 2~ a. Similarly p 2~ b and c. Therefore
p ∈ M(a,b,c). Hence M(a,b,c) is closed.


*3.3. LEMMA. If (1) U, V and W are three regions disjoint by pairs,
(2) there are two sets of points (a,b,c) and (x,y,z) such that a, x ∈ U;
b, y ∈ V; c, z ∈ W, (3) in each set, the points are bi-conjugate to each other;
then it follows that any pair of the six points are bi-conjugate.


Proof. It will suffice to show that x 2~ b. Take any (p,q) different
from x and b. Some region U, V or W is entirely in 1 − (p + q). There
is a point from each of the sets (a,b,c) and (x,y,z) in this region, hence
S_x and S_b both intersect this region and so are identical.


⁹ A region is understood to be a connected open set. The notation U(p) means a
region containing p.
¹⁰ If there is no such pair of regions, then a = b and so a 2~ b.




*3.4. THEOREM. If {M(a_n, b_n, c_n)} is a sequence of distinct bi-cyclic
elements, then lim M(a_n, b_n, c_n) cannot consist of more than two distinct points.

Proof. If the theorem is false we may suppose that a_n → a, b_n → b,
c_n → c, where a, b and c are distinct points. Take disjoint regions U(a),
U(b) and U(c). There is an integer n_0 such that for n > n_0, a_n ∈ U(a),
b_n ∈ U(b), c_n ∈ U(c). Now a 2~ b, b 2~ c, c 2~ a by 3.1, and by 3.3
a_n 2~ a, b, c. Therefore a_n ∈ M(a,b,c). Similarly b_n, c_n ∈ M(a,b,c).
Therefore M(a_n, b_n, c_n) = M(a,b,c); that is, the M(a_n, b_n, c_n) are not all distinct.


Using this theorem we may assert something stronger. 


*3.5. THEOREM. There are at most a denumerable number of bi-cyclic
elements.

Proof. In each bi-cyclic element arbitrarily pick three distinct points.
If for a certain bi-cyclic element the points are (a,b,c) then the element is
M(a,b,c). Now with each element M(a,b,c) associate a positive real number
γ[M(a,b,c)] = min [ρ(a,b), ρ(b,c), ρ(c,a)].¹¹ Take any δ > 0. If there
are an infinite number of bi-cyclic elements with associated γ greater than δ,
then we may assume they are M(a_n, b_n, c_n), γ[M(a_n, b_n, c_n)] > δ > 0 and
a_n → a, b_n → b, c_n → c. Clearly a, b and c are distinct. But it is impossible
for lim M(a_n, b_n, c_n) to consist of three points. The theorem is now obvious.


*COROLLARY. The points of M(a,b,c) belonging to any other bi-cyclic
element are denumerable.

No other bi-cyclic element can have more than two points in common with
M(a,b,c) by 2.4, and as there are only a denumerable number of bi-cyclic
elements, the result follows.


Components of 1 − M(a,b,c).


4.1. Since each bi-cyclic element is a closed set the complement is open and
so the components of the complement are open as the space is locally connected.


Lemma. If M(a,b,c) is a bi-cyclic element, and U(a), U(b), U(c) are 
disjoint regions, then no component S of 1—M(a,b,c) can have points in 
common with each of the three regions. 


Proof. If the statement is false then S + U(a) + U(b) is a region
containing a and b, but not c. Hence there is an arc from a to b in it and
the arc omits c.¹² Some point x on this arc is in S. In the region S + U(c)


¹¹ ρ(a,b) is the distance from a to b.
¹² This is a well known fact first demonstrated by R. L. Moore, “Concerning
continuous curves in the plane,” Mathematische Zeitschrift, vol. 15 (1922), pp. 254-260.




consider an arc from x to c. There is a last point y on this arc which is also
on the arc from a to b. Now y ∈ S and there are three arcs connecting y to
a, b and c, each pair of the arcs having only y in common. We assert that
y 2~ a. Take any pair (p,q) different from a and y. In 1 − (p + q),
S_y contains at least one of the points a, b, c, and S_a contains the same one
since a, b and c are bi-conjugate to each other. Hence S_y = S_a.

Similarly y 2~ b, c. Therefore y ∈ M(a,b,c) which is contradictory to
the fact that y ∈ S.

It is clear that this method of proof will not generalize even to tri-cyclic
elements. As a matter of fact, for a cyclicly connected Peano space the
corresponding result is not true for tri-cyclic elements.


4.2. THEOREM. The frontier points of S are in M(a, b,c) and there cannot 
be more than two of them. 


The proof is immediate from the lemma. 


4.3. It is a well known fact that every region truly in a cyclicly connected 
Peano space has at least two frontier points, thus it is easy to see that 


THEOREM. The frontier of a component S of 1 − M(a,b,c) consists of
precisely two points, and these two points (as a pair) cut the space.


Remark. If the space 1 has the property that no set of (k − 1) points cuts
it, then a component S of the complement of a k-cyclic element M(a_1, ···, a_{k+1})
must have at least k frontier points. It might be conjectured that it will also
have at most k frontier points, but no information seems available on this
point.

4.4. THEOREM. If M is a bi-cyclic element and S is a component of 1 − M
having p and q for frontiers, then S contains at most one bi-cyclic element
containing p and q.

Proof. If M_1 and M_2 are two bi-cyclic elements in S containing p and q
then since S is a region we have an arc in S from m_1 ∈ M_1 to m_2 ∈ M_2. It follows
that m_1 2~ m_2 and so M_1 = M_2.


4.5. THEOREM. If S is a component of 1 − M then any bi-cyclic element N
is such that N ⊂ S or N·S = 0.


Proof. By 2.3 any point of a bi-cyclic element is bi-conjugate to any
other point of it. Thus by 4.3 if N·S ≠ 0, N ⊂ S.


4.6. THEOREM. If {S_n} is any sequence of components of the complement
of a bi-cyclic element M(a,b,c), then d(S_n) → 0.




Proof. If the theorem is false, then there is an infinite sequence of
components with diameters greater than some positive δ. It is no restriction to
assume that they are S_n, and (since they are connected) that there exist triples
of points x_n, y_n, z_n ∈ S_n, (n = 1, 2, ···), such that x_n → x, y_n → y, z_n → z,
where x, y and z are distinct. It is clear that x, y, z ∈ M(a,b,c). Take three
disjoint regions U(x), U(y), U(z). There exists an integer n such that
x_n ∈ U(x), y_n ∈ U(y), z_n ∈ U(z). That is, a component S_n of 1 − M(a,b,c)
has points in common with each of the three regions. This is contrary to
Lemma 4.1.


COROLLARY. If p ≠ q there are only a finite number of bi-cyclic elements
containing p and q.


The proof follows from the above together with 4.4.


Continua of convergence. 


*5.1. THEOREM. Every non-degenerate continuum of convergence is
contained in some bi-cyclic element M(a,b,c).


Proof. If C is a continuum of convergence then no finite set of points
in C separates C in 1. Certainly no two points outside C separate C. Hence
there exist three points a, b, c ∈ C which are bi-conjugate to each other. Every
point of C is bi-conjugate to a, b and c, therefore C ⊂ M(a,b,c).

5.2. THEOREM. Suppose {K_n} is a sequence of continua with the following
properties: (1) L = lim K_n contains at least three points, (2) L·K_n = 0,
(n = 1, 2, 3, ···); then there exists a bi-cyclic element M such that
L = lim [K_n·M].

Proof. Since we are dealing with a compact metric space any sequence
of sets contains a convergent subsequence and from this fact it follows that
there is a bi-cyclic element M which contains lim K_n. As a matter of fact,
the element in question is that of 5.1. The proof from this point on follows
the pattern of the proof for a similar theorem concerning cyclic elements¹³ and
so is omitted.

The well known theorem alluded to above is used to show that the property
of containing a continuum of convergence is cyclicly reducible. We cannot
say that the property is bi-cyclicly reducible since in general the sets [K_n·M]
will not be connected.


5.3. Since a bi-cyclic element may not be connected (in fact, a bi-cyclic element 


¹³ See Kuratowski and Whyburn, loc. cit., pp. 314-315.




can have a non-denumerable number of components each of the same diameter)
we cannot hope for a generalization of the cyclic connectivity theorem within
a bi-cyclic element, but a generalization is possible within the whole space.
In fact, we have at our disposal in the literature the very tool adequate for
this purpose. According to a result of Nöbeling,¹⁴ if two closed sets in a Peano
space cannot be separated by the omission of any set of k points then there are
(k + 1) arcs connecting the sets, and the arcs are independent, except in that
they might have the same end points when considered in pairs.

If we take x and y two points of a k-cyclic element, this means that
no set of k points separates x from y in the space and so there are (k + 1)
independent arcs connecting x and y in the space.


Conclusion. 


The reader will have noticed several unanswered questions by this time. 
We wish, in conclusion, to propose a further problem. One might arbitrarily 
call each single point of the space which is not in any bi-cyclic element a 
degenerate bi-cyclic element, or if a single point is in two or more bi-cyclic 
elements it too might be called a degenerate bi-cyclic element. In this fashion 
a space is covered by its bi-cyclic elements (degenerate and otherwise) so we 
can consider the hyper-space of bi-cyclic elements. One of the most beautiful 
results of the cyclic element theory is that the hyper-space is, in a sense, a 
dendrite. An analogous theorem for bi-cyclic elements would be extremely 
desirable. The fact that the frontier points of a component of the complement 
of a non-degenerate bi-cyclic element are two in number should play an 
important rôle in the attack.


THE OHIO STATE UNIVERSITY.


¹⁴ G. Nöbeling, “Eine Verschärfung des n-Beinsatzes,” Fundamenta Mathematicae,
vol. 18 (1932), pp. 23-38; N. E. Rutt, “Concerning the cut points of a continuous curve
when the arc curve, AB, contains exactly N independent arcs,” American Journal of
Mathematics, vol. 51 (1929), pp. 217-246.




THE POLYTOPE 2_21, WHOSE TWENTY-SEVEN VERTICES
CORRESPOND TO THE LINES ON THE GENERAL
CUBIC SURFACE.*


By H. S. M. Coxeter.


CONTENTS

1. Summary of properties of the 27 lines.
2. An alternative derivation of the cycles.
3. The representation by points in the affine plane.
4. The representation by points in real Euclidean six-space.
5. The representation by points in complex Euclidean three-space.
6. The 36 double-sixes and the polytope 1_22.
7. Collineations which generate [3^{2,2,1}]′ according to a known abstract definition.
8. The representation by lines in PG(3, 4).
9. The 120 trihedral pairs and the polytope 4_21.


§§ 1-4 show how simple and natural is Schoute's representation of the 27 lines by
the vertices of Gosset's six-dimensional polytope, and how easily various plane
projections of the polytope can be drawn. In §§ 5 and 6 we find new coördinates for this
polytope, and for Elte's related polytope 1_22. In § 7 we derive five quaternary
collineations (Table III) which generate the simple group of order 25920 in a particularly
elegant manner. § 8 connects these ideas with the lines on a special cubic surface in
the finite geometry PG(3, 4). Finally, in § 9, we show that the 240 vertices of Gosset's
eight-dimensional polytope 4_21 lie by sixes in forty planes which correspond to the
“non-isotropic” planes of PG(3, 4).


1. Summary of properties of the 27 lines. To construct the configuration
of the lines on the general cubic surface, it is usual to begin with a
double-six

a_1 a_2 a_3 a_4 a_5 a_6
b_1 b_2 b_3 b_4 b_5 b_6,

where a_2, a_3, a_4, a_5, a_6 are five skew lines having a common transversal b_1,
and so on.² The remaining fifteen lines are c_12, c_13, ···, where c_12 is the
line of intersection of the planes a_1 b_2, a_2 b_1. It is then found that c_12 intersects
c_34 but is skew to c_23, and that the lines form thirty-five other double-sixes,



* Received September 15, 1939.
¹ Presented to the Society in two parts: April 16, 1938 and September 7, 1939.
² Schläfli, 22.




each of which could have been used just as well to build up the configuration. 
Every line intersects ten others, which form five intersecting pairs. The 
cubic surface therefore has 45 tritangent planes, each containing three of 
the lines. 
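
These counts can be checked mechanically from the incidence rules of the double-six notation. The sketch below is ours, not part of the original argument; the tuple encoding and the function names `build_lines` and `meets` are hypothetical. It assumes only the standard rules: a_i meets b_j when i ≠ j, a_i and b_i meet c_ij, and c_ij meets c_kl when the suffixes are all distinct.

```python
from itertools import combinations

def build_lines():
    """The 27 lines as tuples: ('a', i), ('b', i), ('c', i, j) with i < j."""
    a = [("a", i) for i in range(1, 7)]
    b = [("b", i) for i in range(1, 7)]
    c = [("c", i, j) for i, j in combinations(range(1, 7), 2)]
    return a + b + c

def meets(p, q):
    """True when the two distinct lines intersect, False when they are skew."""
    if p == q:
        return False
    if p[0] == "c" and q[0] == "c":
        return not set(p[1:]) & set(q[1:])   # c_ij meets c_kl iff suffixes disjoint
    if p[0] == "c":
        return q[1] in p[1:]                 # c_ij meets a_i, a_j, b_i, b_j
    if q[0] == "c":
        return p[1] in q[1:]
    return p[0] != q[0] and p[1] != q[1]     # a_i meets b_j iff i != j

lines = build_lines()
degree = {p: sum(meets(p, q) for q in lines) for p in lines}
# A tritangent plane contains three mutually intersecting lines.
planes = [t for t in combinations(lines, 3)
          if all(meets(p, q) for p, q in combinations(t, 2))]
print(len(lines), set(degree.values()), len(planes))
```

Each line meets ten others forming five intersecting pairs, so it lies in five of the planes, and 27·5/3 = 45.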

The incidences of the lines are unchanged by the transposition of suffix
numbers 1 and 2, which can be thought of as a re-naming of certain lines,
namely the interchange of rows of the double-six

a_1 b_1 c_23 c_24 c_25 c_26
a_2 b_2 c_13 c_14 c_15 c_16.


The transpositions (1 2), (2 3), (3 4), (4 5), (5 6), which we shall denote
by P_1, P, O, N, N_1, generate the symmetric group on the six suffix numbers.
But this is not the whole group of automorphisms of the configuration. The
operator,³ say Q, which interchanges rows of the double-six

c_23 c_13 c_12 a_4 a_5 a_6
b_1 b_2 b_3 c_56 c_46 c_45

increases the order from 6! to 72·6! = 51840; in fact, it enables us to replace
a_1, ···, a_6 by either row of any one of the 36 double-sixes.
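
As a check on the statements about Q and the double-sixes, here is a sketch under our own encoding (the helpers `L`, `skew_sets` and the tuple representation are hypothetical, not from the paper). It verifies that interchanging the two rows of the double-six above preserves all incidences, and counts the sixes of mutually skew lines and the double-sixes by direct search.

```python
from itertools import combinations

def build_lines():
    a = [("a", i) for i in range(1, 7)]
    b = [("b", i) for i in range(1, 7)]
    c = [("c", i, j) for i, j in combinations(range(1, 7), 2)]
    return a + b + c

def meets(p, q):
    if p == q:
        return False
    if p[0] == "c" and q[0] == "c":
        return not set(p[1:]) & set(q[1:])
    if p[0] == "c":
        return q[1] in p[1:]
    if q[0] == "c":
        return p[1] in q[1:]
    return p[0] != q[0] and p[1] != q[1]

def L(name):
    """'c23' -> ('c', 2, 3); 'a4' -> ('a', 4)."""
    return (name[0],) + tuple(int(d) for d in name[1:])

lines = build_lines()

# Q interchanges the rows of the double-six and fixes the other fifteen lines.
Q = {p: p for p in lines}
for x, y in [("c23", "b1"), ("c13", "b2"), ("c12", "b3"),
             ("a4", "c56"), ("a5", "c46"), ("a6", "c45")]:
    Q[L(x)], Q[L(y)] = L(y), L(x)
q_preserves = all(meets(p, q) == meets(Q[p], Q[q])
                  for p, q in combinations(lines, 2))

def skew_sets(size):
    """All sets of `size` mutually skew lines, found by extending partial sets."""
    found = []
    def extend(chosen, candidates):
        if len(chosen) == size:
            found.append(frozenset(chosen))
            return
        for idx, line in enumerate(candidates):
            rest = [m for m in candidates[idx + 1:] if not meets(line, m)]
            extend(chosen + [line], rest)
    extend([], lines)
    return found

sixes = skew_sets(6)
# Two sixes form a double-six when each line of one meets exactly five of the other.
double_sixes = [(A, B) for A, B in combinations(sixes, 2)
                if all(sum(meets(p, q) for q in B) == 5 for p in A)]
print(q_preserves, len(sixes), len(double_sixes), 72 * 720)
```

The 72 sixes are the rows of the 36 double-sixes, and 72·6! = 51840 counts the ways of choosing an ordered six to play the part of a_1, ···, a_6.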

These six operators generate the group in a particularly simple form.
Their product in any order is of period twelve;⁴ e.g., in the order N_1 N O P P_1 Q
it is

R = (1 2 3 4 5 6)·Q
  = (a_1 a_2 a_3 c_56 c_16 b_3 b_4 b_5 b_6 c_23 c_34 a_6)
    (c_45 a_4 c_46 c_15 c_26 b_2 c_12 b_1 c_13 c_24 c_35 a_5) (c_14 c_25 c_36).

We note that R⁴ permutes the 27 lines in nine cycles of three (each cycle
belonging to a tritangent plane⁶), namely

R⁴ = (a_1 c_16 b_6) (a_2 b_3 c_23) (a_3 b_4 c_34) (c_56 b_5 a_6)
     (c_45 c_26 c_13) (a_4 b_2 c_24) (c_46 c_12 c_35) (c_15 b_1 a_5) (c_14 c_25 c_36).
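
The cycles of R can be recomputed directly. The following sketch is ours and rests on an assumption about the order of composition: the substitution (1 2 3 4 5 6) on the suffixes is applied first and Q second. The helper names (`sigma`, `cycles`, `power`) are hypothetical. Under that convention the computation gives two cycles of twelve and one of three, and each cycle of R⁴ is a tritangent plane.

```python
from itertools import combinations

def build_lines():
    a = [("a", i) for i in range(1, 7)]
    b = [("b", i) for i in range(1, 7)]
    c = [("c", i, j) for i, j in combinations(range(1, 7), 2)]
    return a + b + c

def meets(p, q):
    if p == q:
        return False
    if p[0] == "c" and q[0] == "c":
        return not set(p[1:]) & set(q[1:])
    if p[0] == "c":
        return q[1] in p[1:]
    if q[0] == "c":
        return p[1] in q[1:]
    return p[0] != q[0] and p[1] != q[1]

def L(name):
    return (name[0],) + tuple(int(d) for d in name[1:])

def name(line):
    return line[0] + "".join(str(i) for i in line[1:])

lines = build_lines()

def sigma(line):
    """The substitution (1 2 3 4 5 6) applied to the suffix numbers."""
    step = lambda i: i % 6 + 1
    if line[0] == "c":
        i, j = sorted((step(line[1]), step(line[2])))
        return ("c", i, j)
    return (line[0], step(line[1]))

Q = {p: p for p in lines}
for x, y in [("c23", "b1"), ("c13", "b2"), ("c12", "b3"),
             ("a4", "c56"), ("a5", "c46"), ("a6", "c45")]:
    Q[L(x)], Q[L(y)] = L(y), L(x)

def R(line):
    return Q[sigma(line)]          # suffix cycle first, then Q (assumed order)

def power(f, n):
    def g(x):
        for _ in range(n):
            x = f(x)
        return x
    return g

def cycles(f):
    seen, out = set(), []
    for start in lines:
        if start in seen:
            continue
        cyc, x = [], start
        while x not in seen:
            seen.add(x)
            cyc.append(x)
            x = f(x)
        out.append(cyc)
    return out

r_cycles = cycles(R)
r4_cycles = cycles(power(R, 4))
for cyc in r_cycles:
    print("(" + " ".join(name(x) for x in cyc) + ")")
```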

On the other hand, the operator S = (3 4) R⁴ (3 4) R permutes them in three
cycles of nine. (We note that S³ = R⁴.) Another operator of the same kind is

S⁻² = (a_1 a_2 c_56 c_16 b_3 b_5 b_6 c_23 a_6)
      (b_2 b_4 c_12 c_24 c_34 c_35 a_4 a_3 c_46)
      (c_14 b_1 c_13 c_25 a_5 c_45 c_36 c_15 c_26),


³ This is practically the “substitution T” of Burnside, 7, p. 301.

⁴ Coxeter, 10, pp. 163-165.

⁵ Coxeter, 11, p. 608.

⁶ The fact that the 27 lines are the complete intersection of the cubic surface with
nine planes seems to have been first remarked by Baker, 1, p. 15.




in which the first cycle is the same as the first cycle of R, omitting the
tritangent plane a_3 b_4 c_34.


2. An alternative derivation of the cycles. A more elementary (though
somewhat artificial) procedure is based on the observation that any six lines
which are the lines of two tritangent planes can be arranged as a cycle, each
skew to its two neighbors but intersecting the remaining three. For instance,
the planes a_2 b_3 c_23, c_56 b_5 a_6 give the cycle

(a_2 c_56 b_3 b_5 c_23 a_6),

which is permuted by R². Into this cycle, the three lines of any one of twelve




Fig. 1. The construction of a skew hexagon or double-three. 


other tritangent planes can be inserted so that, in the consequent cycle of nine,
each line is skew to its four neighbors (two before and two after) but intersects
the remaining four (its four “opposites”). Inserting thus a_1 c_16 b_6, we
obtain the cycle

(a_1 a_2 c_56 c_16 b_3 b_5 b_6 c_23 a_6),

which is permuted by S⁻².

Into this last cycle we can insert the three lines of any one of three
tritangent planes (from among the eleven just discarded) so that, in the
consequent cycle of twelve, each line is skew to its six neighbors (three before
and three after) but intersects its five opposites. Inserting thus a_3 b_4 c_34, we
obtain the cycle

(a_1 a_2 a_3 c_56 c_16 b_3 b_4 b_5 b_6 c_23 c_34 a_6),

which is permuted by R, and whose square contains the original cycle of
six lines.




3. The representation by points in the affine plane. Given two inter- 
secting lines (from among the 27) and a third line skew to both, there is a 




Fig. 2. The corresponding points. 




Fig. 3. Three enneagons. 


uniquely determined fourth line, intersecting the third and again skew to the
first two. For instance, the three lines a_1, b_2, a_2 determine b_1, so as to form
the intersecting pairs a_1 b_2, a_2 b_1 (Fig. 1). This suggests the possibility of




representing the 27 lines by 27 points, so that two such intersecting pairs are
represented by the pairs of opposite vertices of a parallelogram, skew lines
being represented by joined points. The transitivity of parallelism (Fig. 2)
requires that any two other lines which intersect b_2, a_1 respectively, but not
vice versa, should also intersect b_1, a_2 respectively, so as to form a skew hexagon
or “double-three” (Fig. 1). This is in fact the case; for instance, the two




Fig. 4. Two dodecagons. 


other lines may be a_3, b_3, completing the skew hexagon a_1 b_2 a_3 b_1 a_2 b_3. Thus
the representation is consistent.
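
The uniqueness of the fourth line, on which the parallelogram representation rests, can be confirmed by exhaustion. The sketch below is ours (the function `fourth` and the tuple encoding are hypothetical); for every intersecting pair and every third line skew to both, it lists the lines that meet the third and are skew to the first two.

```python
from itertools import combinations

def build_lines():
    a = [("a", i) for i in range(1, 7)]
    b = [("b", i) for i in range(1, 7)]
    c = [("c", i, j) for i, j in combinations(range(1, 7), 2)]
    return a + b + c

def meets(p, q):
    if p == q:
        return False
    if p[0] == "c" and q[0] == "c":
        return not set(p[1:]) & set(q[1:])
    if p[0] == "c":
        return q[1] in p[1:]
    if q[0] == "c":
        return p[1] in q[1:]
    return p[0] != q[0] and p[1] != q[1]

lines = build_lines()

def fourth(p, q, r):
    """Lines meeting r and skew to the intersecting pair p, q (r skew to both)."""
    return [s for s in lines
            if s not in (p, q, r)
            and meets(s, r) and not meets(s, p) and not meets(s, q)]

configurations = [(p, q, r)
                  for p, q in combinations(lines, 2) if meets(p, q)
                  for r in lines
                  if r not in (p, q) and not meets(r, p) and not meets(r, q)]
unique = all(len(fourth(p, q, r)) == 1 for p, q, r in configurations)

a1, a2, a3 = ("a", 1), ("a", 2), ("a", 3)
b1, b2, b3 = ("b", 1), ("b", 2), ("b", 3)
hexagon = [a1, b2, a3, b1, a2, b3]
# Consecutive sides of the skew hexagon intersect.
hexagon_ok = all(meets(hexagon[i], hexagon[(i + 1) % 6]) for i in range(6))
print(unique, fourth(a1, b2, a2), hexagon_ok)
```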

In order to make a diagram of pleasing appearance in the Euclidean
plane, we begin by drawing a regular enneagon or dodecagon, to represent
the lines of a cycle of nine or twelve as described in § 2. The remaining
representative points can then be derived by completing parallelograms. In
Fig. 3, the three vertices a_1, c_23, b_5 of the outermost enneagon lead to c_46,
which is thus seen to be the point of intersection of the joins a_1 c_16, b_5 c_56;




the rest of the smaller enneagon b_2 b_4 c_12 c_24 c_34 c_35 a_4 a_3 can be obtained
similarly, or can be marked at once by using the second cycle of the operator
S⁻² in § 1. Again, the three vertices a_3, c_46, b_2 of this second enneagon lead
to c_15, which is thus seen to be the point of intersection of the joins a_4 b_2, b_4 a_3;
the third enneagon c_14 b_1 c_13 c_25 a_5 c_45 c_36 c_15 c_26 corresponds to the last cycle


Fig. 5. The enneagonal projection of 2_21.


of S⁻². The complete diagram (Fig. 5, drawn by J. M. Andreas) has one
unfortunate feature: 27 of the parallelograms (such as c_35 c_25 c_12 c_13, a_1 a_5 b_5 b_1,
a_2 a_3 b_3 b_2) degenerate into lines containing four points,⁷ thus we have to think
of c_25 as being joined to c_12 but not to c_13, and so on.

In the dodecagon (Fig. 4), the three vertices a_1, c_23, b_5 lead to c_46, which
is thus seen to be the point of intersection of the joins a_1 c_16, b_5 a_3; the smaller
dodecagon c_15 c_26 b_2 c_12 b_1 c_13 c_24 c_35 a_5 c_45 a_4 can then be completed in
accordance with the second cycle of R. Considerations of symmetry suffice to
show that the remaining three points (corresponding to the last cycle of R)
must coincide at the centre. In the complete diagram (Fig. 6), 24 of the
parallelograms (such as a_1 a_5 b_5 b_1, a_1 a_4 c_16 c_46) degenerate as before, while

⁷ This coincidence has been utilized by Rouse Ball, 2, p. 127.


Fig. 6. The dodecagonal projection of 2_21.


another 24 (such as a_1 c_25 b_4 c_36, c_24 c_14 c_15 c_25) have two opposite vertices
coincident (at the centre).

Figs. 7 and 8 both show the points that represent the ten lines meeting b_4,⁸
and those that represent the six skew lines b_i (i.e., one row of a double-six).


Figs. 9 and 10 show the points that represent the nine lines

c_23 c_15 c_46
a_2 b_1 c_12
b_3 a_5 c_35

⁸ The particular line b_4 is chosen because (a_1 a_2 a_3 c_46 c_14 c_24 c_34 a_6) is one cycle of the
operator (3 4)R. We could obtain a third diagram by drawing this cycle as a regular
octagon; then the three points b_4, a_5, c_45 would coincide at the centre.




of a trihedral pair.⁹ Fig. 9 makes it evident that three such sets of nine can
exhaust the 27 lines, forming one of the forty triads of trihedral pairs.


4. The representation by points in real Euclidean six-space. Let us
now make the representation more symmetrical by insisting that the
parallelograms shall be squares. We can no longer remain in the plane; Fig. 2 has to
be regarded as a square and a triangular prism. The introduction of a_4 and b_4
necessitates a fourth dimension; of a_5 and b_5, a fifth; of a_6 and b_6, a sixth.

Let α_p denote the regular simplex (with p + 1 vertices) in p dimensions.
Then the square and the triangular prism can be written as “rectangular
products”¹⁰ α_1 × α_1, α_2 × α_1. The “double-n”

a_1 a_2 ··· a_n
b_1 b_2 ··· b_n

⁹ Steiner, 27.
¹⁰ Compare Coxeter, 11, pp. 591-592, where the rectangular product α_p × α_q is
written [α_p, α_q]. Sommerville (26, p. 114) calls this a “simplotope of type (p, q).”




(n ≤ 6) is then represented by the rectangular product α_{n−1} × α_1 in n
dimensions, i.e., by a right prism whose base is the regular simplex α_{n−1}.
We are now representing the lines by points in six dimensions in such
a way that the distance between two of the points is 1 or 2^{1/2} according as the
corresponding lines are skew or intersect.¹¹ This applies to the c's as well as
to the a's and b's; for, we may take the representative points to have the
following Cartesian coördinates in eight dimensions:¹²


Fig. 8. β_5 and α_5.


a_1 (2k, 0, 0, 0, 0, 0, 2k, 0),   b_1 (2k, 0, 0, 0, 0, 0, 0, 2k),
a_2 (0, 2k, 0, 0, 0, 0, 2k, 0),   b_2 (0, 2k, 0, 0, 0, 0, 0, 2k),
· · ·
a_6 (0, 0, 0, 0, 0, 2k, 2k, 0),   b_6 (0, 0, 0, 0, 0, 2k, 0, 2k),
c_12 (−k, −k, k, k, k, k, k, k),   · · ·,   c_34 (k, k, −k, −k, k, k, k, k),   · · ·


The number of dimensions is reduced from eight to six by the relations

x_1 + x_2 + x_3 + x_4 + x_5 + x_6 = x_7 + x_8 = 2k.

Since the distance a_1 a_2 is 2^{3/2} k, we have k = 2^{−3/2}.
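
The distance property can be verified for all 351 pairs at once. The sketch below is ours and works in units of k, so that squared distances 8k² and 16k² correspond to 1 and 2 when k = 2^{−3/2}. Two points are assumptions on our part: the seventh and eighth coördinates of every c_ij are taken to be k (the check itself confirms that this choice gives the required distances), and the two linear relations tested are the ones x_1 + ··· + x_6 = 2k and x_7 + x_8 = 2k. The function name `coords` is hypothetical.

```python
from itertools import combinations

def build_lines():
    a = [("a", i) for i in range(1, 7)]
    b = [("b", i) for i in range(1, 7)]
    c = [("c", i, j) for i, j in combinations(range(1, 7), 2)]
    return a + b + c

def meets(p, q):
    if p == q:
        return False
    if p[0] == "c" and q[0] == "c":
        return not set(p[1:]) & set(q[1:])
    if p[0] == "c":
        return q[1] in p[1:]
    if q[0] == "c":
        return p[1] in q[1:]
    return p[0] != q[0] and p[1] != q[1]

def coords(line):
    """Eight coordinates in units of k (so 2 stands for 2k, -1 for -k)."""
    if line[0] == "c":
        x = [1] * 8                     # assumed: last two coordinates equal k
        x[line[1] - 1] = x[line[2] - 1] = -1
        return x
    x = [0] * 8
    x[line[1] - 1] = 2
    x[6 if line[0] == "a" else 7] = 2   # a's carry 2k in x_7, b's in x_8
    return x

lines = build_lines()

def squared_distance(p, q):
    """Squared distance with k^2 = 1/8, kept exact by integer arithmetic."""
    total = sum((u - v) ** 2 for u, v in zip(coords(p), coords(q)))
    assert total % 8 == 0
    return total // 8

distances_ok = all(squared_distance(p, q) == (2 if meets(p, q) else 1)
                   for p, q in combinations(lines, 2))
relations_ok = all(sum(coords(p)[:6]) == 2 and sum(coords(p)[6:]) == 2
                   for p in lines)
print(distances_ok, relations_ok)
```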


¹¹ To put it rather pedantically, the distance is (r + 1)^{1/2} when the two lines have
r intersections. See Du Val, 15, p. 28.
¹² Coxeter, 8, pp. 3, 6.




This representation was discovered by P. H. Schoute and explained by
J. A. Todd.¹³ It is perfect, in the sense that every automorphism of the 27
lines corresponds to a symmetry of the 27 points, and conversely.

We observe that all the above coördinates satisfy the condition x_8 ≤ 2k,
equality holding for the five-dimensional simplex b_1 b_2 b_3 b_4 b_5 b_6
(corresponding to one row of a double-six); and that they all satisfy the condition
x_1 + x_2 ≤ 2k, equality holding for the five-dimensional cross-polytope (or
octahedron-analogue) whose pairs of opposite vertices are

a_1 b_2, a_2 b_1, c_34 c_56, c_35 c_46, c_36 c_45

(corresponding to the ten lines which intersect c_12). Continuing thus, it can
be shown that the 27 points in six dimensions are the vertices of a semi-regular
polytope whose five-dimensional faces are regular polytopes of two kinds: 72
simplexes α_5 (belonging to the 36 inscribed α_5 × α_1's) and 27 cross-polytopes
β_5 (one opposite to each vertex). The numbers of edges α_1, triangles α_2,

¹³ Schoute, 24, pp. 375-383; Todd, 28.


Fig. 9. α_2 × α_2.


tetrahedra α_3, and “pentatopes” α_4, are respectively 216, 720, 1080, and
432 + 216.¹⁴
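
Since an edge of the polytope joins two points at distance 1, that is, two skew lines, the simplicial faces should correspond to sets of mutually skew lines. On that reading (an interpretation of ours, with the hypothetical helper `skew_sets`), the numbers quoted can be recovered by counting such sets, the split 432 + 216 separating the sets of five that can be completed to a six from those that cannot.

```python
from itertools import combinations

def build_lines():
    a = [("a", i) for i in range(1, 7)]
    b = [("b", i) for i in range(1, 7)]
    c = [("c", i, j) for i, j in combinations(range(1, 7), 2)]
    return a + b + c

def meets(p, q):
    if p == q:
        return False
    if p[0] == "c" and q[0] == "c":
        return not set(p[1:]) & set(q[1:])
    if p[0] == "c":
        return q[1] in p[1:]
    if q[0] == "c":
        return p[1] in q[1:]
    return p[0] != q[0] and p[1] != q[1]

lines = build_lines()

def skew_sets(size):
    """All sets of `size` mutually skew lines, found by extending partial sets."""
    found = []
    def extend(chosen, candidates):
        if len(chosen) == size:
            found.append(frozenset(chosen))
            return
        for idx, line in enumerate(candidates):
            rest = [m for m in candidates[idx + 1:] if not meets(line, m)]
            extend(chosen + [line], rest)
    extend([], lines)
    return found

counts = {n: len(skew_sets(n)) for n in range(2, 7)}

# Sets of five lying in a six (faces of an alpha_5) versus the rest.
in_a_six = {frozenset(five)
            for six in skew_sets(6)
            for five in combinations(tuple(six), 5)}
fives = skew_sets(5)
extendable = sum(five in in_a_six for five in fives)
print(counts, extendable, len(fives) - extendable)
```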


This six-dimensional polytope,¹⁵ now known as


¹⁴ Compare Henderson, 20, p. 25.
¹⁵ Coxeter, 18, p. 331. Cf. Fig. 11, below.


Fig. 10. α_2 × α_2.


or 2_21, was discovered by Thorold Gosset.¹⁶ Figs. 5 and 6 can be regarded as
plane projections of its vertices and edges. Squares such as a_1 a_5 b_5 b_1 are
foreshortened into lines. Figs. 7 and 8 show two five-dimensional faces: a
β_5 and an α_5. The reader will have no difficulty in picking out from Figs. 5
and 6 (with the aid of Figs. 3 and 4, respectively) the prismatic figure α_5 × α_1
whose vertices are all the a's and b's. Figs. 9 and 10 show an inscribed
four-dimensional polytope, the rectangular product of two triangles, α_2 × α_2, whose
solid faces consist of six triangular prisms α_2 × α_1 (or α_1 × α_2), all plainly
discernible. It is interesting to compare the different views in the two
projections.


5. The representation by points in complex Euclidean three-space. Witting¹⁷ has shown that the simple group of order 25920 can be represented as a collineation group in four variables. Burkhardt¹⁸ used the transformations B, C, D, S, of Table I (at the end of this paper) to generate the corresponding linear group of order 51840, and remarked that the simple group itself is generated by linear transformations of the six Plücker coördinates, as in the middle section of the table. (The actual transformations are easily deduced from the first section by regarding $p_{\mu\nu}$ as a symbolic product $z_\mu z_\nu$.¹⁹)

The final section of the table contains the corresponding transformations of

$$x_1 = p_{01} - p_{23}, \quad x_2 = p_{02} - p_{31}, \quad x_3 = p_{03} - p_{12}.$$

These are not strictly “linear transformations,” since D involves the conjugate imaginaries $\bar{x}_1$, etc.; however they are “unitary,” in the sense of leaving $x_1\bar{x}_1 + x_2\bar{x}_2 + x_3\bar{x}_3$ invariant. Consequently, if we write $x_\nu = y_\nu + y_{\nu+3}\, i$ where the y’s are real, the corresponding transformations of the six variables $y_1, y_2, \ldots, y_6$ are orthogonal.²⁰ Using the terminology of the six-


¹⁶ Gosset gave this and many other new results in a long and brilliant essay which was refused publication in 1897. (For an abstract, see 19.) Since then, $2_{21}$ has been rediscovered at least three times.

¹⁷ 29.

¹⁸ 6, pp. 318, 320.

¹⁹ Dr. Frame has drawn my attention to the fact that this representation of degree
six is an irreducible component of the Kronecker square of the representation by quater- 
nary collineations, the other component being a representation of degree ten which is 
the corresponding symmetrized Kronecker square. It is also an irreducible component 
of the representation by permutations of the 27 lines on the cubic surface. He gives 
the character of the representation by quaternary collineations as + 4,0,0, +2 
+ (307 +1), + (8w+1), + (w?—1), +(w—1), +1, + 
+ 2,0,0,0,0, +1, +w?, +w, taking the classes in the same order as in his table, 
17, p. 483. 

²⁰ Cf. Burkhardt, 6, p. 326.


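dimensional Euclidean space defined by the y’s, we may say that the vector $(X_1, X_2, X_3)$ is perpendicular to the hyperplane

$$u \equiv \tfrac{1}{2}\sum_{\nu=1}^{3} (\bar{X}_\nu x_\nu + X_\nu \bar{x}_\nu) = 0.$$

Moreover, if $X_1\bar{X}_1 + X_2\bar{X}_2 + X_3\bar{X}_3 = 1$, the reflection in the hyperplane $u = 0$ is the transformation

(5.1) $x'_\nu = x_\nu - 2X_\nu u.$

For, letting m, n take the values 1, 2, 3, 4, 5, 6, and writing $X_\nu = Y_\nu + Y_{\nu+3}\, i$ ($\nu = 1, 2, 3$), we have

$$\sum_{\nu} (\bar{X}_\nu x_\nu + X_\nu \bar{x}_\nu) = 2\sum_{n} Y_n y_n = 2u;$$

and the reflection in the hyperplane $\sum Y_n y_n = 0$ (where $\sum Y_n^2 = 1$) is

$$y'_m = y_m - 2Y_m \sum_{n} Y_n y_n,$$

which leads at once to the above transformation (5.1).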
Corresponding to the 27 lines on the cubic surface, Burkhardt²¹ found
27 linear complexes 
@ — Dog — + opis = 0, 
(5. 2) -—— — © Piz + wo = 0, 
D Por — Doz — wpug + ws, = 0. 


($\lambda, \mu = 0, 1, 2$; $\omega = e^{2\pi i/3}$).
In terms of the x’s, these are the hyperplanes


which are perpendicular to the vectors 
(5. 3) (0, wo, — (—o*, 0, w), —w', 0). 


By keeping this selection of signs, we now have 27 points of the complex 
Euclidean (or “ unitary”) three-space which are permuted among themselves 
by the transformations B, C, D, Sz of the x’s (see Table I). 

In the real Euclidean six-space defined by the y’s we thus find 27 points, which correspond to the lines on the cubic surface, and which are permuted


²¹ 6, p. 323 (14).




among themselves by a rotation group of order 25920. We may naturally expect these to be the vertices of the polytope $2_{21}$. Such is in fact the case. For, by considering various pairs of the points, we find that the distance²²
between (21, and is 6% if a+ for just one of 
the three values of $\nu$, and $3^{1/2}$ otherwise. This agrees with the known correspondence between the 27 lines and the vertices of $2_{21}$ (edge $3^{1/2}$) if we make the points (5.3) represent the lines


UrSp, 


respectively, in the notation of Philip Hall.²³ The three lines of a tritangent plane (such as $t_0u_0$, $u_0s_0$, $s_0t_0$, or $s_0t_0$, $s_1t_1$, $s_2t_2$) are represented by the vertices of an equilateral triangle whose centre is the origin, i.e.²⁴ by three vectors whose sum is (0,0,0).
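The two distances and the zero-sum property can be checked numerically. The Python sketch below is an editorial illustration under a stated assumption: the 27 vectors (5.3) are read as $(0, \omega^a, -\omega^b)$ together with its two cyclic shifts, for $a, b = 0, 1, 2$. That reading is not certain from the printed formula, and the variable names are hypothetical.

```python
import cmath
from itertools import combinations

w = cmath.exp(2j * cmath.pi / 3)  # omega, a primitive cube root of unity

# Assumed reading of (5.3): (0, w^a, -w^b) and its cyclic shifts.
points = []
for a in range(3):
    for b in range(3):
        base = (0, w ** a, -(w ** b))
        for shift in range(3):
            points.append(base[shift:] + base[:shift])


def dist2(p, q):
    """Squared unitary distance, rounded to the nearest integer."""
    return round(sum(abs(x - y) ** 2 for x, y in zip(p, q)))


# For every point, count the other points at squared distance 3 and 6.
neighbour_counts = set()
for i, p in enumerate(points):
    d = [dist2(p, q) for j, q in enumerate(points) if j != i]
    assert set(d) == {3, 6}
    neighbour_counts.add((d.count(3), d.count(6)))

# Triples of points whose vector sum is (0, 0, 0).
zero_sum_triples = sum(
    1
    for triple in combinations(points, 3)
    if all(abs(sum(coords)) < 1e-9 for coords in zip(*triple))
)

print(len(points), neighbour_counts, zero_sum_triples)
```

Under this reading every point has 16 neighbours at distance $3^{1/2}$ (skew lines) and 10 at distance $6^{1/2}$ (intersecting lines), and there are 45 zero-sum triples, one for each tritangent plane.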


6. The 36 double-sixes and the polytope $1_{22}$. As we remarked in § 1, the whole group of automorphisms of the 27 lines, of order 51840,²⁵ can be generated by certain operators each of which interchanges the two rows of a double-six. Since the double-six is represented by the prismatic figure $\alpha_5 \times \alpha_1$ (§ 4), it is geometrically evident that the corresponding orthogonal transformations in Euclidean six-space are reflections which interchange pairs of opposite $\alpha_5$’s of $2_{21}$. These may equally well be described as reflections which interchange pairs of opposite vertices of the “semi-reciprocal” polytope $1_{22}$.

This six-dimensional polytope, whose 72 vertices are the centres of the $\alpha_5$’s of $2_{21}$, was discovered by Elte.²⁶ The numbers of edges, triangles and tetrahedra are respectively 720, 2160 and 2160. Four-dimensional elements of two kinds are involved, namely 432 $\alpha_4$’s and 270 $\beta_4$’s; but the five-dimensional faces are all alike, being 54 “half-measure-polytopes”²⁷ $1_{21}$. From


²² $(|x'_1 - x_1|^2 + |x'_2 - x_2|^2 + |x'_3 - x_3|^2)^{1/2}$.

²³ See Coxeter, 9, p. 396. (It is obviously immaterial whether we take the suffix
numbers to be 0, 1, 2 or 1, 2, 3). In this notation an operator of period nine (such 
as the S or of §1) loses its artificiality, being expressible as (8, 82 tzU2) 
thus the cycle of nine lines required in the construction of Fig. 5 is simply 

²⁴ Cf. Frame, 18, p. 660. We shall have further comments to make on that paper in § 8.

²⁵ This must not be confused with the above mentioned linear group of order 51840, which has the simple group as a factor group but not as a subgroup; nor with Witting’s collineation group of order 51840, which is a direct product, as was pointed out by Maschke, 21, p. 321.

²⁶ 16, pp. 104-108.

²⁷ The vertices of the $(n + 3)$-dimensional polytope $1_{n1}$ are alternate vertices of the




each of the 72 vertices emanate twenty edges which belong in pairs to ten 
regular hexagons lying in planes through the centre; hence there are altogether 
72·10/6 = 120 such diagonal hexagons. The 27 pairs of opposite $1_{21}$’s correspond to the lines on the cubic surface, the 36 pairs of opposite vertices correspond to the double-sixes, and the 120 diagonal hexagons correspond to the trihedral pairs.²⁸ Moreover, the planes of these hexagons fall into 40 sets of three absolutely perpendicular planes, corresponding to the triads of trihedral pairs.

New coördinates for the vertices of $1_{22}$ can of course be derived from those which we found for $2_{21}$ in § 5, but it is perhaps more interesting to obtain them from the collineation group. Corresponding to the 36 double-sixes, Burkhardt²⁹ found 36 linear complexes

3%i(Dpor + opos) = 0, 

341 + wps,) = 0, 

321 poz, + = 0, 

O* Por + Dox + of — po, — wo — Dio = 0. 
(x, A, p= 0, 1,2). 


(6.1) 


In terms of the z’s, these are the hyperplanes 
| + + ofr, + + + or, = 0, 


which are perpendicular to the vectors 


(6.2) 


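(6.31) $(\pm 3^{1/2} i\omega^{\kappa}, 0, 0)$, $(0, \pm 3^{1/2} i\omega^{\lambda}, 0)$, $(0, 0, \pm 3^{1/2} i\omega^{\mu})$,
(6.32) $(\pm\omega^{\kappa}, \pm\omega^{\lambda}, \pm\omega^{\mu})$ (all + or all −).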


In this case there is no systematic way of picking out one from each pair of oppositely directed vectors, but the 72 points having these coördinates are easily recognized as the vertices of $1_{22}$ (edge $3^{1/2}$). In particular, the point (1,1,1) corresponds to the simplex $(0, 1, -\omega^{\pm 1})$, $(-\omega^{\pm 1}, 0, 1)$, $(1, -\omega^{\pm 1}, 0)$ of $2_{21}$.

Clearly, the six points $(\pm 3^{1/2} i\omega^{\lambda}, 0, 0)$ are the vertices of a diagonal hexagon, and we have a simple verification of Todd’s remark³⁰ that twelve of the 120 diagonal hexagons can be selected so as to include all the vertices of $1_{22}$ just once.
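The counts quoted for $1_{22}$ (72 vertices, twenty edges at each vertex, 720 edges, 120 diagonal hexagons) can be reproduced from the coördinates. The Python sketch below is an editorial illustration. It assumes that (6.31) means one non-zero coördinate equal to $\pm 3^{1/2} i\omega^k$ and that (6.32) means $(\pm\omega^a, \pm\omega^b, \pm\omega^c)$ with the three signs equal; the helper names are hypothetical.

```python
import cmath
from itertools import product

w = cmath.exp(2j * cmath.pi / 3)   # omega
r3i = 3 ** 0.5 * 1j                # 3^(1/2) i

vertices = []
# (6.31): a single non-zero coordinate, +-3^(1/2) i w^k.
for pos, sign, k in product(range(3), (1, -1), range(3)):
    v = [0, 0, 0]
    v[pos] = sign * r3i * w ** k
    vertices.append(tuple(v))
# (6.32): (+-w^a, +-w^b, +-w^c) with all three signs equal.
for sign, a, b, c in product((1, -1), range(3), range(3), range(3)):
    vertices.append((sign * w ** a, sign * w ** b, sign * w ** c))


def dist2(p, q):
    """Squared unitary distance, rounded to the nearest integer."""
    return round(sum(abs(x - y) ** 2 for x, y in zip(p, q)))


def key(v):
    """Hashable, rounded form of a complex vector."""
    return tuple((round(z.real, 6), round(z.imag, 6)) for z in map(complex, v))


def negate(v):
    return tuple(-z for z in v)


vertex_keys = {key(v) for v in vertices}
degrees = set()
edges = 0
hexagons = set()
for i, p in enumerate(vertices):
    near = [q for j, q in enumerate(vertices) if j != i and dist2(p, q) == 3]
    degrees.add(len(near))
    for q in vertices[i + 1:]:
        if dist2(p, q) == 3:
            edges += 1
            diff = tuple(y - x for x, y in zip(p, q))
            # An edge pq lies in the central hexagon p, q, q-p, -p, -q, p-q.
            hexagon = frozenset(
                key(v) for v in (p, q, diff, negate(p), negate(q), negate(diff))
            )
            assert hexagon <= vertex_keys
            hexagons.add(hexagon)

print(len(vertex_keys), degrees, edges, len(hexagons))
```

Each edge determines one central hexagon, and each hexagon has six edges, so the 720 edges give 720/6 = 120 hexagons, in agreement with the count 72·10/6.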
measure-polytope or hyper-cube, $\gamma_{n+3}$. Thus $1_{01}$ is the tetrahedron (having alternate vertices of the cube, $\gamma_3$), and $1_{11}$ is the cross-polytope $\beta_4$; but $1_{n1}$ is not regular for $n > 1$.

²⁸ Todd, 28, pp. 204-205.

²⁹ 6, p. 325.

³⁰ 28, p. 205.




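To normalize the hyperplanes (6.2) we multiply by $3^{-1/2}$. The reflection in $i(\omega^{-\lambda} x_1 - \omega^{\lambda}\bar{x}_1) = 0$ is

(6.4) $x'_1 = x_1 + i\omega^{\lambda} \cdot i(\omega^{-\lambda} x_1 - \omega^{\lambda}\bar{x}_1) = \omega^{2\lambda}\bar{x}_1$, $\quad x'_2 = x_2$, $\quad x'_3 = x_3$.

The reflection in $3^{-1/2}(x_1 + x_2 + x_3 + \bar{x}_1 + \bar{x}_2 + \bar{x}_3) = 0$ is

(6.5) $x'_1 = x_1 - s$, $\quad x'_2 = x_2 - s$, $\quad x'_3 = x_3 - s$,

where $s = (x_1 + x_2 + x_3 + \bar{x}_1 + \bar{x}_2 + \bar{x}_3)/3$.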


The group of order 51840 which such reflections generate is conveniently denoted by $\left[\begin{smallmatrix} 3,3 \\ 3,3 \\ 3 \end{smallmatrix}\right]$ or (for brevity) $[3^{2,2,1}]$, since it has the abstract definition³¹

(6.6)
$$O^2 = N^2 = N_1^2 = P^2 = P_1^2 = Q^2$$
$$= (ON)^3 = (NN_1)^3 = (OP)^3 = (PP_1)^3 = (OQ)^3$$
$$= (ON_1)^2 = (OP_1)^2 = (PQ)^2 = (QN)^2 = (NP)^2$$
$$= (P_1Q)^2 = (QN_1)^2 = (N_1P)^2 = (NP_1)^2 = (N_1P_1)^2 = 1.$$

Fig. 11. The group $[3^{2,2,1}]$.


In order to generate the group in this elegant manner, we have to select six of the 36 reflecting hyperplanes in such a way that the angle between two of them is $\pi/2$ or $\pi/3$ according as the period of the product of the reflections is 2 or 3. In other words, we have to select six vertices of $1_{22}$ which are


³¹ Coxeter, 10, p. 164.




connected by five edges as in Fig. 11. In this diagram³² it is to be understood that pairs of vertices not joined by edges are distant $2^{1/2}$ edge-lengths ($= 6^{1/2}$). $N_1$ denotes one of the two opposite vertices which are interchanged by the reflection $N_1$; similarly for the other letters.

Since the transformation (6.4) is simpler than (6.5), it is desirable to 
take as many as possible of the vertices from the set (6.31) and as few as 
possible from (6.32). The former set of vertices belong to three hexagons 
lying in absolutely perpendicular planes; so we cannot use them exclusively 
(since Fig. 11 is connected). We take NN, to be a side of one of these 
hexagons, PP, to be a side of another, and Q to be a vertex of the third. 
The remaining vertex, O, has to be chosen from the other set, and we naturally 
take it to be (1,1,1). We thus obtain 

$N_1$ $(-3^{1/2} i, 0, 0)$,    $P_1$ $(0, -3^{1/2} i, 0)$,
$N$ $(3^{1/2} i\omega^2, 0, 0)$,    $P$ $(0, 3^{1/2} i\omega^2, 0)$,
$O$ $(1, 1, 1)$,
$Q$ $(0, 0, 3^{1/2} i\omega^2)$.
The corresponding reflections are given in the first section of Table II. 
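That these six vertices are related as in Fig. 11 can be confirmed by forming the six reflections in real six-space and computing the periods of their products in pairs. The Python sketch below is an editorial illustration. It reads the six vertices as $N_1(-3^{1/2}i,0,0)$, $N(3^{1/2}i\omega^2,0,0)$, $P_1(0,-3^{1/2}i,0)$, $P(0,3^{1/2}i\omega^2,0)$, $O(1,1,1)$, $Q(0,0,3^{1/2}i\omega^2)$ and uses $x_\nu = y_\nu + y_{\nu+3} i$ from § 5; the helper names are hypothetical.

```python
import cmath
from itertools import combinations

w = cmath.exp(2j * cmath.pi / 3)   # omega
s = 3 ** 0.5 * 1j                  # 3^(1/2) i

vertices = {
    "N1": (-s, 0, 0), "N": (s * w ** 2, 0, 0),
    "P1": (0, -s, 0), "P": (0, s * w ** 2, 0),
    "O": (1, 1, 1), "Q": (0, 0, s * w ** 2),
}


def real6(v):
    """(x1, x2, x3) -> (y1, ..., y6) with x_nu = y_nu + i*y_(nu+3)."""
    v = [complex(z) for z in v]
    return [z.real for z in v] + [z.imag for z in v]


def reflection(v):
    """6x6 matrix of the reflection in the hyperplane perpendicular to v."""
    y = real6(v)
    norm2 = sum(c * c for c in y)
    return [[(i == j) - 2 * y[i] * y[j] / norm2 for j in range(6)] for i in range(6)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(6)) for j in range(6)]
            for i in range(6)]


def period(m, limit=12):
    """Smallest n <= limit with m^n equal to the identity."""
    power = m
    for n in range(1, limit + 1):
        if all(abs(power[i][j] - (i == j)) < 1e-9
               for i in range(6) for j in range(6)):
            return n
        power = matmul(power, m)
    raise ValueError("period exceeds limit")


R = {name: reflection(v) for name, v in vertices.items()}
periods = {frozenset(pair): period(matmul(R[pair[0]], R[pair[1]]))
           for pair in combinations(R, 2)}
joined = {pair for pair, n in periods.items() if n == 3}

print(sorted(tuple(sorted(pair)) for pair in joined))
```

Exactly five products have period 3, namely $N_1N$, $NO$, $OP$, $PP_1$ and $OQ$; the other ten have period 2, as the relations (6.6) require.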


7. Collineations which generate $[3^{2,2,1}]'$ according to a known abstract definition. Writing $p_{01} - p_{23}$, $p_{02} - p_{31}$, $p_{03} - p_{12}$ for $x_1$, $x_2$, $x_3$, the transformation $x'_1 = x_1 - s$ of O (Table II) becomes

$$p'_{01} - p'_{23} = p_{01} - p_{23} - (t + \bar{t}), \quad \text{where}$$
$$t = (p_{01} + p_{02} + p_{03} - p_{23} - p_{31} - p_{12})/3.$$


This is consistent with either 
or 
Por = — P23 — ft, = — pion + i. 


Thus the group $[3^{2,2,1}]$ is generated by collineations³³ in the six variables $p_{\mu\nu}$, or equally well by anticollineations. (For details, see the second and third sections of Table II). In the case of the collineations it is impossible to regard the p’s as Plücker coördinates in a projective space defined by $z_0, z_1, z_2, z_3$. But in the case of the anticollineations this can be done, as in the final section of the table. An ambiguity of sign appears at this stage, since reversing the signs of the z’s has no effect on the p’s. Hence although the group generated


³² The reader may be interested to draw plane projections of $1_{22}$ corresponding to the projection of $2_{21}$ shown in Figs. 5 and 6, and to pick out a set of five edges related as in Fig. 11.

³³ When working with the p’s there is no need to distinguish between collineations and linear transformations. (Burkhardt, 6, p. 320).




by anticollineations in the z’s is still $[3^{2,2,1}]$, the transformations themselves generate a group of order 103680 which we shall call the binary $[3^{2,2,1}]$.³⁴

The signs have actually been chosen so as to make the transformations satisfy (6.6) with the “= 1” omitted. These relations provide an abstract definition for the binary $[3^{2,2,1}]$. For, we could have continued them by writing “$= Z$, $Z^2 = 1$”; but it is unnecessary to do so, since the relations $P^2 = Q^2 = (PQ)^2 = Z$ imply $Z^2 = 1$ (and are fulfilled by the quaternion group).

The most important subgroups are $\{N_1, N, O, P, P_1\}$, of index 72; $\{N, O, P, P_1, Q\}$, of index 27; and $\{N_1N, N_1O, N_1P, N_1P_1, N_1Q\}$, of index 2. The last of these, having the abstract definition

8) == (UV)* == (VW)? = (WX)? = (VY)? 
a= (UW)? = (UX)? = (UY)? = (VX)? — (WY)? = (XY)’, 

is naturally called the “binary $[3^{2,2,1}]'$,” since by adding the extra relation we obtain $[3^{2,2,1}]'$, which is the simple group itself.³⁵ Table III (immediately deducible from Table II) gives linear transformations of the p’s which generate $[3^{2,2,1}]'$, and linear transformations of the z’s which generate the binary $[3^{2,2,1}]'$. These transformations, while possibly no more elegant than Burkhardt’s, have the advantage of generating the groups according to a known abstract definition.


The operator of period twelve considered in § 1 is 


(1.2) R=—UVWXY 


——i) 0 0 0 k 0 k —k 36 0 0 0 —w? 
0 —w ) 0 0 k k k 0 0 ——p 0 
0 0 —w? 0 kk 0 0 0 0 


0 0—w —kk wo 0 0 O 
0 0 0 —I1 ( 00 ow 0 


00 —1 0 00 0 —w 
01 0 0 —w 0 0 0 
0 0 0 0 0 0 


—wok —ok wk 0 
wk 0 —k 
k —wk |° 
0 —wk—k —ok 


** By analogy with the binary icosahedral group P* = Q* = (PQ)?. (See Seifert 
and Threlfall, 25, p. 218.) Since the binary icosahedral group arises as a linear group in two variables, and the binary $[3^{2,2,1}]$ as a linear group in four variables, it would perhaps have been better to name the latter the quaternary $[3^{2,2,1}]$.

³⁵ Coxeter, 10, p. 160.

³⁶ $k = (\omega - \omega^2)/3 = 3^{-1/2} i$.




The central is generated by

$$Z = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}.$$


We have seen elsewhere³⁷ that the abstract group $[3^{2,2,1}]$ is generated by the two operators R and O. The results on which this statement is based apply to the binary $[3^{2,2,1}]$ without change. (For instance, $QOQ = O^{-1}Q^{-1}O^{-1}Z = OQO$, $QPQ = P^{-1}Z = P$.) It can be shown similarly that the binary $[3^{2,2,1}]'$ is generated by R and X. In fact, writing for brevity


$T_n = R^{-n} X R^{n}$, we have

=X = N;P,, 

T;= - ROP, = P,O, 

T, = OPO: 0Q0 = OPZQO, 

To = ROR = PN, 

T, = = OQON,, 
whence 

T Tl; = N;,Z0, 
= OQZ, 
T.T,T oT; = OP, 
and finally, since 
== X? X-2, 

= ZT oT = OXR?(XR*) 


In other words, the group of linear transformations, of order 51840, is 
generated by 


—ok —ok wk O- 00 
—wok ok 0 —k 00—1 0 
wk 0) k —wok 0 1 0 0 

| 0 —wk —k —wk 10 0 0 


By the same kind of argument, the second generator can be U instead of X.³⁸ Since the congruence

$$\omega^2 + \omega + 1 \equiv 0 \pmod 7$$

³⁷ Coxeter, 11, p. 615.
³⁸ Brahana (4, p. 533) proved that the simple group $[3^{2,2,1}]'$ is generated by two operators which may be identified with our UVW and XY (or $N_1NOP$ and $P_1Q$).




has the roots 2 and 4, one of the simplest modular representations³⁹ of the binary $[3^{2,2,1}]'$ is derived by putting 2 for $\omega$, 4 for k,⁴⁰ and regarding all the coefficients as residues modulo 7. Thus the group is generated by the


transformations 

4043) 0003 

0030)’ 4430)’ 0400]? 
(0003 403) 2000 

0060 0005 
919061’ 5000 
(1000) (0400) 


in the field GF [7], or alternatively by X (or U) and 
| 6620 
610 3 


05 3 6 


The reader will probably agree that these matrices are easier to manipulate than those of (7.2).
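The arithmetic behind the substitution $\omega = 2$, $k = 4$ is easy to verify. The Python sketch below is an editorial illustration with hypothetical variable names. It finds the roots of $\omega^2 + \omega + 1 \equiv 0 \pmod 7$ and evaluates $k = (\omega - \omega^2)/3$ with the modular inverse of 3, which needs Python 3.8 or later for `pow(3, -1, p)`.

```python
p = 7

# Roots of w^2 + w + 1 = 0 in the residues modulo 7.
roots = [x for x in range(p) if (x * x + x + 1) % p == 0]

w = 2
inv3 = pow(3, -1, p)             # multiplicative inverse of 3 modulo 7
k = (w - w * w) * inv3 % p       # k = (w - w^2)/3

# Since k = 3^(-1/2) i in the complex case, 3k^2 should be -1.
three_k_squared = 3 * k * k % p

print(roots, inv3, k, three_k_squared)
```

The output shows the roots 2 and 4, the value $k \equiv 4$, and $3k^2 \equiv 6 \equiv -1 \pmod 7$, which mirrors $3k^2 = -1$ for the complex value $k = 3^{-1/2} i$.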


8. The representation by lines in PG(3,4). Instead of GF[7], we may use the field GF[2²] defined by the irreducible congruence

$$\omega^2 + \omega + 1 \equiv 0 \pmod 2.$$

Tables I, II, III all remain valid if we interpret $\omega$ as a root of this congruence (a primitive root in the field) and define the conjugate $\bar{w}$ of any mark $w$ to be its square. Since −1 = 1, we can replace every minus sign by plus. Also $k = (\omega - \omega^2)/3 = \omega + \omega^2 = 1$. Since Z = −1, there is no longer any distinction between collineations and linear transformations. We easily verify that the abstract definitions (6.6) and (7.1) (with “= 1”) are satisfied. The transformations


000 0 1011 00 
0w 0111 00 w 0 
0 0 w 0 2219 0 0 0 
000 1101 00 00) 
³⁹ Brauer and Nesbitt, 5, p. 6.

⁴⁰ $k = (\omega - \omega^2)/3 = (2 - 4)/3 \equiv 4 \pmod 7$.


0001 00 w 0 

0010 00 0 w 
X= 7 

100 0 0w? 0 0 


being unitary, generate HO(4, 2²).⁴¹
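The field arithmetic used in this section is small enough to write out in full. The Python sketch below is an editorial illustration. The encoding of the four marks 0, 1, $\omega$, $\omega^2$ as the integers 0, 1, 2, 3 is an assumption made for convenience, and the function names are hypothetical.

```python
# GF(2^2): marks 0, 1, w, w^2 encoded as 0, 1, 2, 3.
# Addition is bitwise XOR, because w^2 = w + 1 in characteristic 2.
EXP = [1, 2, 3]               # EXP[n] encodes w^n
LOG = {1: 0, 2: 1, 3: 2}      # inverse table for the non-zero marks


def add(a, b):
    return a ^ b


def mul(a, b):
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 3]


def conj(a):
    """The conjugate of a mark is its square."""
    return mul(a, a)


W = 2
W2 = mul(W, W)

congruence = add(add(W2, W), 1)                  # w^2 + w + 1
k = add(W, W2)                                   # k = w + w^2
norms = [mul(x, conj(x)) for x in (1, 2, 3)]     # x * conj(x) for x != 0
involution = all(conj(conj(x)) == x for x in range(4))

print(congruence, k, norms, involution)
```

In this field every non-zero mark has norm $x\bar{x} = x^3 = 1$, so the “unitary” form $\sum z_\nu \bar{z}_\nu$ coincides with the cubic form $\sum z_\nu^3$.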

Following Frame,⁴² let us regard these as collineations in the finite geometry PG(3,4) with homogeneous coördinates $(z_0, z_1, z_2, z_3)$. We observe that the line whose Plücker coördinates $(p_{01}, p_{02}, p_{03}, p_{23}, p_{31}, p_{12})$ are
(0, w*, w*, 0,,) is invariant under the transformations N,, NV, O, P, Q of 
Table II, and consequently also under the transformations U, V, W, Y of 
Table III (with the coefficients modified to fit the Galois field, as described 
above). This line is transformed into (0, o, 0”, 0,07,0) by P, or by X, and 
into the 27 lines 

(0, 0, o, 
(8.1) (w4, 0, 0, 0%), 


wo, 0, o, oH. 0), 


by other operators of either $[3^{2,2,1}]$ or $[3^{2,2,1}]'$.
In other words, since the coefficients of the 27 linear complexes (5. 2) 


satisfy 


+ Moots, ost, = —- 2 =D, 


the corresponding linear complexes in PG (3,4) are special, each consisting 
of all lines which meet one of the lines (8.1). On the other hand, the 36 
linear complexes (6.1) remain general, since for them the invariant is 
—3=1. 

Another consequence of reducing the coefficients to marks of GF[2²] is that the cubic form $z_0^3 + z_1^3 + z_2^3 + z_3^3$ is now invariant under the group $[3^{2,2,1}]'$ generated by U, V, W, X, Y; it is transformed into its conjugate by the anticollineations which generate $[3^{2,2,1}]$. Thus the “cubic surface”

(8.2) $z_0^3 + z_1^3 + z_2^3 + z_3^3 = 0$


in the finite geometry PG(3,4) is invariant under a group of collineations 
and anticollineations which is simply isomorphic with the group of auto- 
morphisms of the lines on the general cubic surface in ordinary projective 
space. This result suggests the possibility that all the automorphisms of the 
lines on the special cubic surface (8.2) can be realized as collineations and 


⁴¹ For alternative generators of HO(4, 2²), see Frame, 17, p. 482. For other representations of $[3^{2,2,1}]'$, see Dickson, 14, p. 298, and Brahana, loc. cit. (4, p. 533).
⁴² 18.


anticollineations. This is in fact the case, since the lines in question are precisely (8.1). The nine lines $(0, \omega^{\lambda}, \omega^{\mu}, 0, \omega^{-\lambda}, \omega^{-\mu})$ belong to the trihedral pair⁴³ which is put in evidence by writing (8.2) in the form

$$(z_0 + z_1)(z_0 + \omega z_1)(z_0 + \omega^2 z_1) + (z_2 + z_3)(z_2 + \omega z_3)(z_2 + \omega^2 z_3) = 0.$$

These “trihedra” are degenerate, each consisting of three planes through a line. Each of the planes is therefore met by the opposite “trihedron” in three concurrent lines (in contrast to a tritangent plane of the general cubic surface, in which the three lines form a triangle).

The same thing happens in the case of the special cubic surface

$$z_0^3 + z_1^3 + z_2^3 + z_3^3 = 0$$

in ordinary projective space. The 18 planes

$$z_\mu + \omega^{\lambda} z_\nu = 0$$

each contain three concurrent lines of the surface. But the 27 planes

$$z_0 + \omega^{\lambda} z_1 + \omega^{\mu} z_2 + \omega^{\nu} z_3 = 0$$

are proper tritangent planes, each containing a triangle. E.g. the plane $z_0 + z_1 + z_2 + z_3 = 0$ contains the triangle cut out on it by the planes

$$z_0 + z_1 = 0, \quad z_0 + z_2 = 0, \quad z_0 + z_3 = 0.$$

This dichotomy of the 45 tritangent planes indicates that the group of automorphisms of the lines on this special cubic surface is a proper subgroup of $[3^{2,2,1}]$. But the full symmetry is restored when we pass to the finite geometry by regarding the z’s as marks of GF[2²]. For then all the 45 planes degenerate the same way; e.g. the planes

$$z_0 + z_1 = 0, \quad z_0 + z_2 = 0, \quad z_0 + z_3 = 0$$

concur at the point (1,1,1,1).
The configuration symbols for the whole space⁴⁴ PG(3,4) and for the cubic surface (8.2) are easily seen to be:

     85   21   21              45    3   13
      5  357    5               5   27    5
     21   21   85              13    3   45

By this we mean that the “surface” contains 45 of the 85 points of the


⁴³ Cf. Frame, 18, p. 661.
⁴⁴ Schoute, 23, p. 5 (m = 4).


space, and 27 of the 357 lines; and that the 45 isotropic⁴⁵ planes of the space, each containing three of the 27 lines, may be regarded as “tangent” planes to the surface. The configuration thus determined is self-dual. There is a one-one correspondence between the 45 points and the 45 planes, each point being the point of concurrence⁴⁶ of the three lines in one of the planes. Each plane contains the corresponding point and twelve other points of the set. Each line contains five points and lies in the five corresponding planes.
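These incidence numbers can be confirmed by brute force, since PG(3,4) has only 85 points. The Python sketch below is an editorial illustration. It encodes the marks 0, 1, $\omega$, $\omega^2$ of GF[2²] as 0, 1, 2, 3 (an encoding chosen for convenience), enumerates the points and lines of the space, and picks out those lying on $z_0^3 + z_1^3 + z_2^3 + z_3^3 = 0$; all names are hypothetical.

```python
from itertools import combinations, product

# GF(4) = {0, 1, w, w^2} encoded as 0, 1, 2, 3; addition is XOR since w^2 = w + 1.
EXP = [1, 2, 3]               # EXP[n] encodes w^n
LOG = {1: 0, 2: 1, 3: 2}


def mul(a, b):
    """Product in GF(4)."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 3]


def normalize(v):
    """Scale a non-zero vector so that its first non-zero coordinate is 1."""
    lead = next(x for x in v if x)
    inv = EXP[-LOG[lead] % 3]
    return tuple(mul(inv, x) for x in v)


points = sorted({normalize(v) for v in product(range(4), repeat=4) if any(v)})


def line_through(p, q):
    """All points a*p + b*q with (a, b) != (0, 0)."""
    return frozenset(
        normalize(tuple(mul(a, x) ^ mul(b, y) for x, y in zip(p, q)))
        for a, b in product(range(4), repeat=2) if a or b
    )


lines = {line_through(p, q) for p, q in combinations(points, 2)}


def on_surface(p):
    """True when z0^3 + z1^3 + z2^3 + z3^3 = 0 in GF(4)."""
    total = 0
    for z in p:
        total ^= mul(z, mul(z, z))
    return total == 0


surface_points = [p for p in points if on_surface(p)]
surface_lines = [ln for ln in lines if all(on_surface(p) for p in ln)]
lines_per_point = {sum(p in ln for ln in surface_lines) for p in surface_points}
points_per_line = {len(ln) for ln in lines}

print(len(points), len(lines), len(surface_points), len(surface_lines),
      lines_per_point, points_per_line)
```

The enumeration gives 85 points and 357 lines in the space, of which 45 points and 27 lines lie on the surface; each of the 45 points lies on three of the 27 lines, and every line carries five points.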

The plane corresponding to $(z_0, z_1, z_2, z_3)$ is $(u_0, u_1, u_2, u_3)$ where $u_\nu = \bar{z}_\nu$. Hence the Plücker coördinates of the lines (after multiplication by $\omega$ or $\omega^2$ if necessary) satisfy

$$\bar{p}_{01} = p_{23}, \quad \bar{p}_{02} = p_{31}, \quad \bar{p}_{03} = p_{12},$$


as in (8.1). Since porpes; = por” = por, the first three of the Pliicker codérdi- 
nates, so normalized, are the same as Frame’s “ non-homogeneous coérdinates 
(4),” while the remaining three are their respective conjugates. Frame’s 
rules for the incidences (Theorem 2) follow immediately. In particular, the 
three sides of a triangle on the general cubic surface correspond to two inter- 
secting lines and a third line which is linearly dependent on them (i.e., 
concurrent and coplanar with them). 


9. The 120 trihedral pairs and the polytope $4_{21}$. Witting has shown⁴⁷ that the groups we have been considering have subgroups of index 40 of two distinct types. In the complex projective space with coördinates $(z_0, z_1, z_2, z_3)$, the plane $z_0 = 0$ is transformed into 40 planes, and the set of four planes $z_0 z_1 z_2 z_3 = 0$ is transformed into 40 tetrahedra. The vertices of the tetrahedra are 40 points whose coördinates are the same as the tangential coördinates of the planes. The 40 planes are⁴⁸

$z_\nu = 0$ ($\nu = 0, 1, 2, 3$),

2, + we. + oz, = 0 (A, » = 0,1, 2), 
— — wht, + wz, = 0, 
— w + wiz, —%,=—0, 
— — w2;, + == (), 


or, when multiplied together, = 0 in Maschke’s notation.⁴⁹
When we interpret the coefficients as marks of the field GF[2²] (and


⁴⁵ Frame, 18, p. 659.

⁴⁶ Thus the phrase “which form a triangle” should be deleted from the middle of p. 659 of Frame’s paper.

⁴⁷ 29, pp. 41-43. See also Burkhardt, 6, p. 319.

⁴⁸ Blichfeldt, 3, p. 151.

⁴⁹ 21, p. 333.


therefore ignore the negative signs), these are precisely the 40 non-isotropic planes of PG(3,4). Hence, by Frame’s Theorem 3,⁵⁰ the 40 tetrahedra correspond to the 40 triads of trihedral pairs of the cubic surface. In particular, the tetrahedron $z_0 z_1 z_2 z_3 = 0$ corresponds⁵¹ to the triad

boty | | Solo Site Sel, 

| 148; UpSe| Sit, Seto Soto 

tot, |U2S2 Sete Sot Silo 
in Hall’s notation. 

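When we interpret $(z_0, z_1, z_2, z_3)$ as non-homogeneous coördinates (in complex Euclidean four-space), we find that the point $(3^{1/2} i, 0, 0, 0)$ is transformed into the 240 points

$(\pm 3^{1/2} i\omega^{\lambda}, 0, 0, 0)$, $(0, \pm 3^{1/2} i\omega^{\lambda}, 0, 0)$, $(0, 0, \pm 3^{1/2} i\omega^{\lambda}, 0)$, $(0, 0, 0, \pm 3^{1/2} i\omega^{\lambda})$,
$(0, \pm\omega^{\lambda}, \pm\omega^{\mu}, \pm\omega^{\nu})$ (with signs agreeing),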

(9. 1) 4 0, + w), 


(=x: wo, + 0, + 


These correspond in sets of six to the 40 points considered above in the projective three-space, the six points of each set being derivable from any one of them by multiplying all four coördinates by the same power of $-\omega$. Thus $(\pm 3^{1/2} i\omega^{\lambda}, 0, 0, 0)$ is one such set of six, and $(0, \pm\omega^{\lambda}, \pm\omega^{\lambda}, \pm\omega^{\lambda})$ is another. When interpreted as points of real Euclidean eight-space, each set of six forms a regular hexagon. We thus have 40 hexagons lying in different planes through the origin. We shall see that the 240 points, so interpreted, have a far greater degree of symmetry than the 40 planes in which they lie; in other words, the 240 points can be distributed among 40 hexagons in many different ways. We shall prove, in fact, that the 240 points are the vertices of another of Gosset’s semi-regular polytopes, namely


⁵⁰ Frame, 18, p. 660.
detail, (1,wA,0,0) and (0,0,1,w#) give the line (0,1, w#,0,wd+u,w\). To 
normalize this, we multiply by obtaining (0, w\tu, wd-u, 0, w-r-u, OF 
(0, wAt+H, wr-H) or Daan. . To arrange nine such lines as a trihedral pair, we fix 


for the rows and yw for the columns. 




or $4_{21}$ (of edge $3^{1/2}$). This eight-dimensional polytope⁵² has seven-dimensional faces of two kinds: 17280 simplexes $\alpha_7$, and 2160 cross-polytopes $\beta_7$. The numbers of edges, triangles, tetrahedra, $\alpha_4$’s, $\alpha_5$’s, and $\alpha_6$’s are respectively 6720, 60480, 241920, 483840, 483840, and 138240 + 69120. Its symmetry group $[3^{4,2,1}]$, of order 696729600, is generated by reflections⁵³ in hyperplanes which perpendicularly bisect the joins of pairs of opposite vertices.

Eight of these 120 reflections suffice, the abstract definition being⁵⁴


= P? = P= Q? 
=(ON)* =(NN,)* =(N.N2)* =(N2N5)* =(OP)* =(PP,)* =(0Q)* 
—(ONy)? =(OP,)? =(PQ)? =(QN)? =(N,Ng)? 
—(P,Q)? =(QNv)? =(NVP)? =(NP,)?=(NvP,)?=1 (v==1, 2,3), 


Fig. 12. The group $[3^{4,2,1}]$.


analogous to (6.6). There is a central of order two, whose quotient group is 


derived by inserting the extra relation 
NOP P,Q) =1. 
Seven rotations, such as

$N_3N_2$, $N_3N_1$, $N_3N$, $N_3O$, $N_3P$, $N_3P_1$, $N_3Q$,

generate a subgroup of index two. This subgroup of the central quotient group⁵⁵ of $[3^{4,2,1}]$ is the simple group FH(8, 2), of order 174182400.


⁵² Gosset, 19, p. 48.

⁵³ Coxeter, 9, p. 388.

⁵⁴ Coxeter, 10, p. 171. The subgroup $[3^{2,2,1}]$ generated by $O$, $N$, $N_1$, $P$, $P_1$, $Q$ has no direct connection with the binary $[3^{2,2,1}]$ generated by linear transformations of the z’s. (The O, N, etc. of Table II are of period four).

⁵⁵ Coxeter, 10, p. 174. There, and on p. 179, the word sub-group has several times been used by mistake for factor group.


In order to prove that the points (9.1) are the vertices of $4_{21}$, we select eight of them whose mutual distances are indicated in Fig. 12. By comparison with Fig. 11, we easily find one such set of eight points to be


N, (3%i,0,0,0), 
Nz (— w”, 0,— 
N, (0,—3%i, 0,0), P, (0,0,—3%i, 0), 
N (0, 0, 0), (0, 0, 0), 
Q (0, 0, 0, 3%iw?). 


The corresponding reflections are given in the first section of Table IV. It 
only remains to be observed that the given set of 240 points is invariant under 
these transformations. 

When we pass to the real Euclidean eight-space, a more natural form of coördinates for Fig. 12 (on a different scale) is⁵⁶

$N_3$ (2, −2, 0, 0, 0, 0, 0, 0),
$N_2$ (0, −2, 2, 0, 0, 0, 0, 0),
$N_1$ (0, 0, 2, −2, 0, 0, 0, 0),    $P_1$ (0, 0, 0, 0, 0, 0, −2, 2),
$N$ (0, 0, 0, −2, 2, 0, 0, 0),    $P$ (0, 0, 0, 0, 0, 2, −2, 0),
$O$ (0, 0, 0, 0, 2, 2, 0, 0),
Q 4,4, 5,3; 


The group $[3^{4,2,1}]$ is then generated by reflections in the hyperplanes


Yo= "1; 
= 
Yo = Yo = 
Y3 = Ys = Yeo; 
Ys + = 0, 
sy =0. 


(See the second section of Table IV). 
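The counts of vertices and edges of $4_{21}$ can be recovered from real coördinates of this kind. The Python sketch below is an editorial illustration under an assumption: it uses the standard coördinates for the 240 vertices on the scale of edge $2^{3/2}$, namely all vectors with two entries $\pm 2$ and six zeros, together with all vectors $(\pm 1, \ldots, \pm 1)$ having an even number of minus signs. The parity convention is a choice made here, not taken from the text. Seven of the eight points listed above for Fig. 12 are legible, and the sketch checks that they belong to this set; the names are hypothetical.

```python
from itertools import combinations, product

vertices = set()
# Two coordinates equal to +-2, the remaining six equal to 0 (112 points).
for i, j in combinations(range(8), 2):
    for si, sj in product((2, -2), repeat=2):
        v = [0] * 8
        v[i], v[j] = si, sj
        vertices.add(tuple(v))
# All coordinates +-1 with an even number of minus signs (128 points).
for signs in product((1, -1), repeat=8):
    if signs.count(-1) % 2 == 0:
        vertices.add(signs)
vertices = sorted(vertices)


def dist2(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q))


# Edges join vertices at the minimum distance 2^(3/2), i.e. squared distance 8.
edges = sum(1 for p, q in combinations(vertices, 2) if dist2(p, q) == 8)

fig12 = {
    "N3": (2, -2, 0, 0, 0, 0, 0, 0),
    "N2": (0, -2, 2, 0, 0, 0, 0, 0),
    "N1": (0, 0, 2, -2, 0, 0, 0, 0),
    "N": (0, 0, 0, -2, 2, 0, 0, 0),
    "O": (0, 0, 0, 0, 2, 2, 0, 0),
    "P": (0, 0, 0, 0, 0, 2, -2, 0),
    "P1": (0, 0, 0, 0, 0, 0, -2, 2),
}
chain = ["N3", "N2", "N1", "N", "O", "P", "P1"]
chain_dist2 = [dist2(fig12[a], fig12[b]) for a, b in zip(chain, chain[1:])]

print(len(vertices), edges, edges // 6, chain_dist2)
```

The result is 240 vertices and 6720 edges. Since each central hexagon accounts for six edges, the same figure gives 6720/6 = 1120 diagonal hexagons, and the seven listed points form a chain of consecutive edges.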

Since the symmetry group of $4_{21}$ is $[3^{4,2,1}]$, of order 696729600, while that of the 40 hexagons is the binary $[3^{2,2,1}]$, of order 103680, it follows that such a set of 40 hexagons, which together use up all the 240 vertices of $4_{21}$, can be selected in 6720 ways. Having made one such selection, we find that the planes of the 40 hexagons (which correspond to the non-isotropic planes of the finite geometry⁵⁷) fall into 40 sets of four absolutely perpendicular


⁵⁶ For the consequent coördinates of all the vertices of $4_{21}$ (edge $2^{3/2}$), see Coxeter, 8, p. 2. ($(PA)_8$ is an alternative symbol for $4_{21}$).
⁵⁷ Frame, 18, p. 660 (§ 4).


planes. Hence, finally, these forty sets of four planes correspond to the forty 
triads of trihedral pairs of the cubic surface. 
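The indices quoted in this and the preceding sections follow from the group orders by simple division. The Python sketch below is an editorial illustration collecting that arithmetic; the variable names are hypothetical and the orders are those stated in the text.

```python
order_421 = 696729600        # [3^{4,2,1}], symmetry group of 4_21
order_221 = 51840            # [3^{2,2,1}], automorphisms of the 27 lines
order_binary_221 = 103680    # the binary [3^{2,2,1}]
order_simple = 25920         # [3^{2,2,1}]', the simple group

# Number of ways of choosing the set of 40 hexagons.
selections = order_421 // order_binary_221

# Orders of the subgroups of index 72 and 27 in [3^{2,2,1}].
order_index_72 = order_221 // 72
order_index_27 = order_221 // 27

# Central quotient of [3^{4,2,1}], then its subgroup of index two.
order_fh82 = order_421 // 2 // 2

# 240 vertices of 4_21 taken six at a time give the hexagons.
hexagon_count = 240 // 6

print(selections, order_index_72, order_index_27, order_fh82, hexagon_count)
```

The quotient 696729600/103680 = 6720 is the number of selections mentioned above, and 696729600/4 = 174182400 is the order given for FH(8, 2).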

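Another subgroup of index 6720 in $[3^{4,2,1}]$ can be derived by specializing a single one of the 1120⁵⁸ diagonal hexagons of $4_{21}$ (instead of forty of them). For, in (9.1), the plane of the hexagon $(\pm 3^{1/2} i\omega^{\lambda}, 0, 0, 0)$ is absolutely perpendicular to the six-space $z_0 = 0$, which contains the 72 points

$(0, \pm 3^{1/2} i\omega^{\lambda}, 0, 0)$, $(0, 0, \pm 3^{1/2} i\omega^{\lambda}, 0)$, $(0, 0, 0, \pm 3^{1/2} i\omega^{\lambda})$,
$(0, \omega^{\lambda}, \omega^{\mu}, \omega^{\nu})$, $(0, -\omega^{\lambda}, -\omega^{\mu}, -\omega^{\nu})$.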


By (6.3), these are the vertices of the six-dimensional polytope⁵⁹ $1_{22}$, whose symmetry group, $[3^{2,2,1}] \times \mathfrak{G}_2$, is derived from that of $2_{21}$ by adjoining the central inversion. In the notation of Table IV, this subgroup is
{N3, N, O, P, Q}. 

The 40 planes and 40 absolutely perpendicular six-spaces give, by inter- 
section with a six-space of general position, a configuration of 40 points and 
40 four-spaces, which is the real counterpart of Witting’s configuration of 
40 points and 40 planes in complex projective three-space. Witting described 
his configuration as the three-dimensional analogue of the Hessian configura- 
tion of 9 + 12 points and 9 + 12 lines in the complex projective plane. The 
real counterpart of the latter comes from the planes of nine diagonal triangles 
of $2_{21}$ and of twelve diagonal hexagons of the semi-reciprocal $1_{22}$, together with
the absolutely perpendicular four-spaces, by intersection with a four-space of 
general position. 

TABLE I 


The simple group $[3^{2,2,1}]'$, of order 25920, as a collineation group (after Burkhardt)


B C D S, 
2, = —k(2, + 2 + 23)* 
2’. —k(z2, + wz, + 2, — zk, 
(2, + + w2,) 2s 23 
Pn = — K (por + Por + Dos) Pos — Pur w* Dor 
= — k( Por + Po ) Por Pre 
P's = — Kk (pox + Doz + Dos) Po Par WPo: 
= Ke (pug + Dar + Piz) Pr2 — WP 
= (Poy + w? psy + Pos Pos 
= Kk (Pog + + Py) Por 
= —k(a, + + #3) — 
x", k (ay + + ) WL 

* $k = (\omega - \omega^2)/3 = 3^{-1/2} i$, $\omega = e^{2\pi i/3}$.

⁵⁸ Coxeter, 10, p. 181.

⁵⁹ Coxeter, 10, p. 178; 12, p. 477.


TABLE II


The group $[3^{2,2,1}]$, of order 51840, generated by anticollineations


Tt =(Por + Doe + Pos — — Pi — Pz) /3. 


N, N O P, Q 
wi, r,—s* * 
= — Pr Pa — tt Pos Po 
= Po Poz Poa —t —Pa Po 
P = Pos Pos Pos — t Pos — 
= —Pa — Px +t Pes Pos 
Pa = Pot Pa +t — —Po Ps 
= Pos wpor —— Pos — t — Pas — Pas — Pu 
= — Du — pas — ps —t WPo2 Poa Pu 
P'os = — Py — Pu —Pu—t — Pa — Pr WD og 
P 23 Pea P23 Por Por Pos Po 
P's = — Do — Po — Pu +t w* Pas 
— Pos — Pos — Dos — Doz w* 
k (2, + 2+ 23) wZ, 2, wz, 
— 2 — wZ, k(—2,— 4%, +2,) —w2, — 2, 
= 23 wz, k(—2% + 2,—23) —wZ, — 2%, — we, 
—2, — wi, 4%, + 2) we, 2, w*Z, 
4, + 23) /3. 
tt + Por + Pos — — Par — Diz) /3- 
TABLE III 
New collineations for generating $[3^{2,2,1}]'$

= we, 7, —s* 
= —8 WD, L, 
= Dor — Pas — Px — prs 
Por — t — — Pn Pos 
= Pos Pos — t Pos Pos — 
P's = WP 23 —Patt — Po — Pu — Pa 
= Pa Pa +t — w* — Por Pas 
= Pi +t Pr Piz — 
= — wey k (2 + — 23) — wz, — 2; we, 
— we, k (2, + 2 + 25) — wz, — 2, 
— Kk (2 + 2%, — 22) — 
2, = — wz, k + 2, — 23) 




TABLE IV 


Two ways of generating the group $[3^{4,2,1}]$, of order 696729600


N O P P, Q 
~ * ~ ~ ~ 

Zs 2e 2, 2s 

&3 p &3 @3 we, 


* p = — 2 + 23 + — wk, + w2,) /3. 


UNIVERSITY OF TORONTO. 


REFERENCES. 


1. H. F. Baker, Principles of Geometry, vol. 6, Cambridge, 1933.

2. W. W. Rouse Ball, Mathematical Recreations and Essays, eleventh edition, London, 1939.

3. H. F. Blichfeldt, Finite Collineation Groups, Chicago, 1917.

4. H. R. Brahana, “Pairs of generators of the known simple groups whose orders are less than one million,” Annals of Mathematics, vol. 31 (1930), pp. 529-549.

5. R. Brauer and C. Nesbitt, “On the modular representations of groups of finite order,” University of Toronto Studies (Mathematical Series, No. 4, 1937).

6. H. Burkhardt, “Untersuchungen aus dem Gebiete der hyperelliptischen Modulfunctionen, III,” Mathematische Annalen, vol. 41 (1893), pp. 313-343.

7. W. Burnside, “The determination of all groups of rational linear substitutions of finite order which contain the symmetric group in the variables,” Proceedings of the London Mathematical Society (2), vol. 10 (1912), pp. 284-308.

8. H. S. M. Coxeter, “The pure Archimedean polytopes in six and seven dimensions,” Proceedings of the Cambridge Philosophical Society, vol. 24 (1928), pp. 1-9.

9. ———, “The polytopes with regular-prismatic vertex figures,” Part I, Philosophical Transactions of the Royal Society of London (A), vol. 229 (1930), pp. 329-425.

10. ———, Part 2, Proceedings of the London Mathematical Society (2), vol. 34 (1932), pp. 126-189.

11. ———, “Discrete groups generated by reflections,” Annals of Mathematics, vol. 35 (1934), pp. 588-621.

12. ———, “Finite groups generated by reflections, and their subgroups generated by reflections,” Proceedings of the Cambridge Philosophical Society, vol. 30 (1935), pp. 466-482.

13. ———, “Wythoff’s construction for uniform polytopes,” Proceedings of the London Mathematical Society (2), vol. 38 (1935), pp. 327-339.




L. E. Dickson, Linear Groups, with an exposition of the Galois Field Theory, Leipzig, 1901.

P. Du Val, “On the directrices of a set of points in a plane,” Proceedings of the London Mathematical Society (2), vol. 35 (1932), pp. 23-74.

E. L. Elte, The Semi-regular Polytopes of the Hyperspaces, Groningen, 1912.

J. S. Frame, “The simple group of order 25920,” Duke Mathematical Journal, vol. 2 (1936), pp. 477-484.

J. S. Frame, “A symmetric representation of the twenty-seven lines on a cubic surface by lines in a finite geometry,” Bulletin of the American Mathematical Society, vol. 44 (1938), pp. 658-661.

T. Gosset, “On the regular and semi-regular figures in space of n dimensions,” Messenger of Mathematics, vol. 29 (1900), pp. 43-48.

A. Henderson, The Twenty-seven Lines upon the Cubic Surface, Cambridge, 1911.

H. Maschke, “Aufstellung des vollen Formensystems einer quaternären Gruppe von 51840 linearen Substitutionen,” Mathematische Annalen, vol. 33 (1888), pp. 317-344.

L. Schläfli, “An attempt to determine the twenty-seven lines upon a surface of the third order, and to divide such surfaces into species in reference to the reality of the lines upon the surface,” Quarterly Journal of Mathematics, vol. 2 (1858), pp. 110-120.

P. H. Schoute, Mehrdimensionale Geometrie, vol. 1, Leipzig, 1902.

P. H. Schoute, “On the relation between the vertices of a definite six-dimensional polytope and the lines of a cubic surface,” Koninklijke Akademie van Wetenschappen te Amsterdam, Proceedings of the Section of Sciences, vol. 13 (1910), pp. 375-383.

H. Seifert and W. Threlfall, Lehrbuch der Topologie, Leipzig, 1934.

D. M. Y. Sommerville, An Introduction to the Geometry of n dimensions, London, 1929.

J. Steiner, “Über die Flächen dritten Grades,” Journal für die reine und angewandte Mathematik, vol. 53 (1857), pp. 133-141.

J. A. Todd, “Polytopes associated with the general cubic surface,” Journal of the London Mathematical Society, vol. 7 (1932), pp. 200-205.

A. Witting, Ueber eine der Hesse’schen Configuration der ebenen Curve dritter Ordnung analoge Configuration im Raume, auf welche die Transformationstheorie der hyperelliptischen Functionen (p = 2) führt, Dresden, 1887.




PLANE BROWNIAN MOTION.*


By Paul Lévy.


Introduction. We propose to study the random functions X(t), Y(t) which, in Brownian motion projected onto a plane, define the coordinates of the center of a molecule considered independently of the others. We shall deal only with mathematical Brownian motion, whose precise definition is given below. Let us note at once that it differs from real Brownian motion in that the mean free path is assumed to be reduced to zero. One is thus led to study continuous functions of extremely irregular behavior: in every interval they have infinitely many maxima and minima; their four derived numbers are infinite, except on a set of measure zero where one of these numbers is finite and on a countable set where two are finite; at every point at least two of them are infinite. These are, of course, almost sure properties, that is, properties realized with probability one, which may nevertheless fail in cases that are possible although of probability zero. We shall sometimes omit to recall this restriction; it seems to us that, in questions where it is understood that random phenomena are involved, no ambiguity can result.

The stochastic independence of X(t) and Y(t) has led mathematicians to study first linear Brownian motion (that is, Brownian motion projected onto a line). This study originated in the mutually independent works of Bachelier¹ and N. Wiener,² and numerous papers have been devoted to it in recent years. The reader will find an account of the fundamental principles of the theory thus built up in our Théorie de l’addition des variables aléatoires (1937), pp. 166-173. New results were given in our recent paper on some homogeneous stochastic processes.³ In § 1 of the present paper we recall the mathematical definition of Brownian motion, that is, the stochastic definition of X(t), together with some known properties of this motion that will be useful later. As for the general theorems of the calculus of probabilities that we shall take as known, they are all to be found in our 1937 book cited above; we ask the reader to refer to it when necessary.


* Received October 17, 1939.

¹ Bachelier, Calcul des probabilités (1912). At that date Bachelier appears as a precursor. Although the manner in which problems with time as a continuous variable are introduced leaves something to be desired, it remains true that it is in this work that one finds for the first time the idea that the Gaussian law arises necessarily as a consequence of the continuity of an additive process, and the relation between this process and the heat equation. One should also note several formulas concerning the maximum deviation, and perhaps the formula (which I have sought in vain in a large number of earlier works) which, in the case of absolutely continuous laws, defines the law of the sum of two independent random variables.

² N. Wiener, Differential Space (Publications of the Massachusetts Institute of Technology, Ser. II, No. 60, June 1923).

§§ 2 and 3, which concern linear Brownian motion and plane motion respectively, are devoted to the study of some local properties of the trajectories. §§ 4 and 5 are devoted respectively to the study of an expression that may be represented symbolically by the integral


B= 


and which we call the Brownian oscillation of X(t) in the interval (0, 1), and to that of the integral


— 


which, with suitably chosen axes, represents the area enclosed between the curve C, the trajectory of plane Brownian motion during the time interval (0, 1), and its chord.

These are stochastic integrals of an essentially new type. They may be defined as limits of sums, which, for the area S, amounts to regarding it as the limit of the area obtained by replacing the curve C by an inscribed polygonal line; but the limit is not an ordinary one. According to the case, it will be a matter of convergence in probability, or of convergence in mean square (which, as is well known, implies the former), or of almost sure convergence. This last notion can moreover enter only as a consequence of a restrictive hypothesis on the mode of division of the interval of variation of t into very numerous and very small intervals; it will suffice for us to assume that every division point, once chosen, is retained in all later subdivisions.
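The polygonal approximation of the area S described above can be sketched numerically. The following is only an illustration under stated assumptions: the path is a discrete Gaussian random walk standing in for the curve C, and the helper names (`chord_area`, `brownian_path`, `nested_areas`) are hypothetical. The nested dyadic subdivisions respect the hypothesis that every division point, once chosen, is kept.

```python
import math
import random

def chord_area(points):
    """Signed area enclosed by the polygonal line through `points` and its chord.

    Shoelace formula for the closed polygon obtained by joining the last
    point back to the first (the chord); counterclockwise is positive.
    """
    total = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # the final edge is the chord
        total += x0 * y1 - x1 * y0
    return 0.5 * total

def brownian_path(steps, rng):
    """Plane random walk on (0, 1) with independent Gaussian increments of variance 1/steps."""
    sd = math.sqrt(1.0 / steps)
    x = y = 0.0
    path = [(x, y)]
    for _ in range(steps):
        x += rng.gauss(0.0, sd)
        y += rng.gauss(0.0, sd)
        path.append((x, y))
    return path

def nested_areas(path, levels):
    """Areas of inscribed polygons on nested dyadic subdivisions of one path.

    Assumes len(path) - 1 is a power of two, at least 2**levels.
    """
    n = len(path) - 1
    return [chord_area(path[:: n >> k]) for k in range(1, levels + 1)]
```

For example, `nested_areas(brownian_path(2 ** 12, random.Random(0)), 12)` lists the successive polygonal areas for a single simulated path; how they settle down as the subdivision is refined is what the limit defining S refers to.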

³ Compositio Mathematica, vol. 7 (1939), pp. 283-339. This paper will be referred to below by the abbreviation “Processus,” and our book cited in the text will be referred to as “Var. aléatoires.”

Chance may moreover enter, on the one hand, in the choice of the division points and, on the other, in that of the functions X(t) and Y(t). The simplest point of view consists in fixing a mode of division of the interval of integration, regarding X(t) and Y(t) as random, and proving under these conditions the existence of the limits B and S; this is what we do at the beginning of each of §§ 4 and 5. But we then take up the inverse point of view; it is this point of view that, in our opinion, introduces into analysis a notion of an entirely new nature. For each determination, either of X(t) or of the curve C, the division points being chosen at random, it may happen that the sums used to define B and S converge almost surely to limits, without one being able to conclude the existence of the integral in the ordinary sense, that is, of a limit independent of the mode of division of the interval of integration and existing in all cases. These are nonrandom properties of each function X(t) or of each curve C. Perhaps the most important result of this paper is that, in the random scheme of mathematical Brownian motion, one almost surely obtains trajectories having these properties. As for the nature of the limit, in the case of B it is not random: one has B = 1. In the case of S, it is a new random variable whose nature we have sought to determine; this is the object of the end of § 5 (5° to 12° of that section). The distribution function of S does not seem to admit a simple expression; we give several results concerning this function, of which the most important is perhaps the following: the determination of the joint law of the two variables S and L (L being the length of the chord subtending the arc C) depends on a second-order partial differential equation of elliptic type, satisfied by one of the derivatives of the distribution function, and which, taking into account that a distribution function is in question, determines it completely.
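The statement B = 1 can be illustrated numerically. The sketch below rests on an assumption: it reads B as the limit of the sums of squared increments of X(t) over finer and finer subdivisions of (0, 1), which is one reading consistent with the text but is not confirmed by the displayed formula. The function name `dyadic_sums` is hypothetical, and the path is a discrete Gaussian random walk.

```python
import math
import random

def dyadic_sums(levels, seed=0):
    """Sums of squared increments of one simulated path on nested dyadic subdivisions.

    Entry k uses the subdivision of (0, 1) into 2**k equal intervals; every
    division point of level k is retained at level k + 1.
    """
    rng = random.Random(seed)
    n = 2 ** levels
    sd = math.sqrt(1.0 / n)  # finest increments are Gaussian with variance 1/n
    x = [0.0]
    for _ in range(n):
        x.append(x[-1] + rng.gauss(0.0, sd))
    sums = []
    for k in range(levels + 1):
        step = n >> k
        sums.append(sum((x[i + step] - x[i]) ** 2 for i in range(0, n, step)))
    return sums
```

The first entry is simply the square of X(1), which is random; the later entries have expectation 1 and a variance that shrinks like 2 divided by the number of intervals, so they cluster near 1.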

Let us observe, concerning B, that the fact that chance realizes with probability one functions for which this integral exists does not imply that it is easy to name such a function. § 4 contains some remarks on this subject which it may be useful to complete and make precise. One could also study the area S from the same point of view, seeking to name a curve for which the area exists from the stochastic point of view but not from that of ordinary analysis.

In § 6 we shall show that the curve C is, with probability one, a set of zero plane measure. Yet the fact that B is positive implies that the curve makes enough infinitely small detours to be able to fill an area. But for it actually to fill an area, these infinitely small detours would have to be organized in a way that chance has no possibility of producing.




This remark extends to random schemes other than that of Brownian motion. In § 7 we study some examples of such schemes. All satisfy the condition that the curve Γ described by the moving point as the parameter t varies from zero to one is composed of two arcs stochastically similar to the complete curve; by this we mean that any set of curves which are possible curves Γ is also, up to a similarity, a set of possible shapes of each of these arcs, and with the same probability as for Γ. In the case of Brownian motion this property holds for the arc corresponding to any interval of variation of t, and the ratio of stochastic similarity is the square root of the length of that interval. In the case of the schemes studied in § 7, this property belongs only to the arcs corresponding to intervals of variation of t between two consecutive multiples of $2^{-n}$, so that the set of values of t that are dyadic fractions plays a quite remarkable role in the study of these curves. The ratio of stochastic similarity is, depending on the scheme studied, either determined for each of the partial arcs considered or, on the contrary, random; but in the latter case the ratios relating to different arcs are not independent. Whenever the sum of the squares of these ratios, extended over any division of the curve into arcs stochastically similar to the whole curve, equals unity, one may say, as in the case of Brownian motion, that the curve makes enough infinitely small detours to be able to fill an area. The curves in question may then be nonrandom, composed of parts similar to the whole, and in particular two well-known curves, introduced in § 7, which do actually fill areas. But whenever chance plays a sufficient role in the definition of the curve, then, for the same reasons as in the case of Brownian motion, it is infinitely improbable that the curve actually fills an area.

For these schemes we also study the area S enclosed between the arc and its chord. In the cases where, as we have just indicated, the curve might at first sight fill an area but has no chance of actually doing so, this area appears as a sum $\sum \pm s_\nu$ of triangular areas, the series $\sum s_\nu^2$ being convergent. If then the signs are chosen at random, the series defining S is almost surely convergent; it is almost sure that the curve studied bounds an area defined in the sense indicated in connection with Brownian motion. If on the contrary all the terms are positive, or if (as will be the case for certain of the schemes of § 7) they are grouped into extended blocks of terms of the same sign, the series considered is divergent; S then appears as infinite or indeterminate. Since the sign of the triangular areas $\pm s_\nu$ can be fixed by a definite rule even in cases where chance enters otherwise in the definition of the curve, the question of the existence of the area S is not tied to the preceding one; at any rate, the existence of the area S permits no conclusion about the plane measure of the set of points of the curve.

The large number of random schemes that could be studied in this way, and of problems that arise, has obliged us not only to make a selection, but in some cases to content ourselves with abridged proofs or even with statements without proof, and also to state, without indicating their solution, problems that seem to us to deserve later study.

The original plan of this paper included a final section devoted to types of stochastic integrals and stochastic differential equations that generalize S, and in particular to the integral

$$\int \varphi(X, Y, Z)\,dX(t) + \chi(X, Y, Z)\,dY(t) + \psi(X, Y, Z)\,dZ(t),$$

which defines the work of the moving particle in a field of force, and to the total differential equation

$$dU(t) = \varphi(X, Y, Z, U)\,dX(t) + \chi(X, Y, Z, U)\,dY(t) + \psi(X, Y, Z, U)\,dZ(t),$$

X, Y, Z being the coordinates of a particle in three-dimensional Brownian motion, and $\varphi$, $\chi$, $\psi$ being Lipschitz continuous functions. One can define U as the limit of sums or of solutions of finite difference equations. The point of view from which we approach the study of these equations is, moreover, different from that adopted by S. Bernstein in his well-known memoir on stochastic differential equations, in that we suppose the experiments determining X, Y, Z to have been carried out before the integration. The integral U(t) with a given initial value will therefore be a functional depending in a nonrandom way on the three random functions X, Y, Z. By virtue of almost sure properties of these functions, the operations leading to the definition of U(t) have a meaning, under the conditions indicated in connection with the expressions B and S: there may exist modes of division of the interval of integration into partial intervals for which these operations do not converge; but such modes of division are exceptional.

One can naturally generalize the preceding theory by taking for the trajectory of the point X, Y, Z a random scheme different from that of Brownian motion.

Present circumstances (in September 1939) have decided me to give up, for the moment, the writing of this last part, and to ask the editors of the American Journal of Mathematics to publish this paper in its present state; I thank them in advance.⁴

1. Definition of the random function X(t). We consider a real variable t, ranging from zero to infinity, and denote by X(t) a random function of t with the following properties: X(0) = 0; for any $t \ge 0$ and $\tau > 0$, the increment $\Delta X(t) = X(t + \tau) - X(t)$ is a Gaussian variable with standard deviation $\sqrt{\tau}$; it is moreover stochastically independent of the past [that is, of the set of values taken by X(t') in the interval (0, t)].

Let us recall how one can define a sequence of experiments leading to the determination of a random function X(t) that actually has the preceding properties. Consider for this purpose three values $t_0$, $t_1$, $t_2$ of t ($t_0 < t_1 < t_2$), and set

$$X' = X(t_1) - X(t_0), \qquad X'' = X(t_2) - X(t_1),$$
$$X = X' + X'' = X(t_2) - X(t_0).$$

The conditions imposed on X(t) are evidently compatible only because the sum of the two Gaussian variables $X'$ and $X''$, with respective standard deviations $\sqrt{t_1 - t_0}$ and $\sqrt{t_2 - t_1}$, is indeed a Gaussian variable with standard deviation $\sqrt{t_2 - t_0}$. Under these conditions, when $X(t_0)$ is known, one has two stochastically equivalent procedures for determining $X(t_1)$ and $X(t_2)$. One can, by two independent experiments, determine the two successive increments $X'$ and $X''$. One can also first determine the total increment and then interpolate, that is, determine $X(t_1)$ according to the conditional probability law on which this variable depends when $X(t_0)$ and $X(t_2)$ are known. Under these conditions, $X(t_1)$ has as its expected value the number

$$m_1 = \frac{(t_2 - t_1)\,X(t_0) + (t_1 - t_0)\,X(t_2)}{t_2 - t_0}$$

obtained by linear interpolation, and $X(t_1) - m_1$ is a Gaussian variable with standard deviation

$$\sigma = \sqrt{\frac{(t_1 - t_0)(t_2 - t_1)}{t_2 - t_0}}.$$

In particular, if $t_1 - t_0 = t_2 - t_1 = \tau$, one has $\sigma = \sqrt{\tau/2}$; that is, the difference between $X(t_1)$ and its expected value is a Gaussian variable whose standard deviation is $\sqrt{2}$ times smaller than when only $X(t_0)$ is known.
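The interpolation formulas can be checked by a short computation with the standard conditioning rule for jointly Gaussian variables. This is a supplementary verification, not a step taken from the text; the abbreviations a and b are introduced here only for convenience.

```latex
\begin{aligned}
&a = t_1 - t_0,\quad b = t_2 - t_1,\qquad
 X' \sim N(0,a),\ X'' \sim N(0,b)\ \text{independent},\quad X = X' + X'',\\
&\mathrm{E}\{X'X\} = a,\qquad \mathrm{E}\{X^2\} = a + b,\\
&\mathrm{E}\{X' \mid X\} = \frac{a}{a+b}\,X,\qquad
 \sigma^2 = a - \frac{a^2}{a+b} = \frac{ab}{a+b}
          = \frac{(t_1-t_0)(t_2-t_1)}{t_2-t_0},\\
&m_1 = X(t_0) + \frac{a}{a+b}\,\bigl[X(t_2)-X(t_0)\bigr]
     = \frac{(t_2-t_1)\,X(t_0) + (t_1-t_0)\,X(t_2)}{t_2-t_0}.
\end{aligned}
```

With $a = b = \tau$ this gives $\sigma^2 = \tau/2$, in agreement with the particular case stated above.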


⁴ Some of the results established in this paper were announced in two Notes presented to the Académie des Sciences (C. R., vol. 207, p. 1152; vol. 209, p. 140, and erratum, p. 387).




This result is evidently related to the known properties of the Gaussian law in two variables, in the isotropic case: by these properties, if $X'$ and $X''$ are two independent Gaussian variables with the same standard deviation $\sqrt{\tau}$, then

$$\frac{X' + X''}{2} \quad\text{and}\quad \frac{X' - X''}{2}$$

are two independent Gaussian variables with standard deviation $\sqrt{\tau/2}$; this remark completely determines the law on which $X'$ depends when $X$ is known.

One can therefore, without changing the probability law imposed on the set of the three variables $X(t_0)$, $X(t_1)$, and $X(t_2)$, proceed by interpolation, that is, determine $X(t_1)$ only after $X(t_0)$ and $X(t_2)$ have been determined. Consequently, to determine X(t) in the interval (0, 1), one can determine X(1), then, by successive interpolations, determine X(1/2), then X(1/4) and X(3/4), and so on. Denote by $X_n(t)$ the continuous function equal to the values so obtained when t is a multiple of $2^{-n}$, and varying linearly in each of the intervals between two consecutive multiples of this number; $X_n(t)$ may be regarded as the n-th approximation of X(t), and X(t) may be defined as the almost sure limit of these approximations. One has in fact the following theorem:


THEOREM 1. With probability one, as n increases indefinitely, $X_n(t)$ tends to a continuous function X(t), and does so uniformly in the interval (0, 1).


The proof is very simple. Denote by $\delta_n$ the maximum of $|X_{n+1}(t) - X_n(t)|$ in the interval (0, 1). It is the largest of the absolute values of $2^n$ Gaussian variables with the same standard deviation $q^{n+2}$ ($q = 1/\sqrt{2}$). Taking Boole's lemma into account, one therefore has

$$\Pr\{\delta_n > q^{n+2} x_n\} \le 2^n \sqrt{\frac{2}{\pi}} \int_{x_n}^{\infty} e^{-u^2/2}\,du = \epsilon_n.$$

If one takes for $x_n$ the value $c\sqrt{2n \log 2}$, with $c > 1$, then $\epsilon_n$ is the general term of a convergent series. By Cantelli's lemma (or Borel's, since the $\delta_n$ are independent), there then almost surely exists a finite number N such that, for $n > N$, one has $\delta_n < c\,q^{n+1}\sqrt{n \log 2}$, which establishes the almost sure uniform convergence of $X_n(t)$ to a limit X(t), which is evidently continuous, Q.E.D.
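The construction by successive interpolations can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the author's procedure in any literal sense: the function names `levy_refine` and `levy_construction` are hypothetical, and the standard library generator stands in for the "experiments." Each refinement adds to the linear interpolation at a midpoint a Gaussian term whose standard deviation is half the square root of the current mesh, which is the value $q^{n+2}$ of the proof when the mesh is $2^{-n}$.

```python
import math
import random

def levy_refine(values, rng):
    """From X_n on the multiples of 2**-n, build X_{n+1} on the multiples of 2**-(n+1)."""
    mesh = 1.0 / (len(values) - 1)   # current mesh 2**-n
    sigma = math.sqrt(mesh) / 2.0    # sqrt((mesh/2) * (mesh/2) / mesh)
    refined = []
    for left, right in zip(values, values[1:]):
        mid = 0.5 * (left + right) + sigma * rng.gauss(0.0, 1.0)
        refined.extend([left, mid])  # old points are kept unchanged
    refined.append(values[-1])
    return refined

def levy_construction(levels, seed=0):
    """Return X_levels on the dyadic grid of (0, 1) and the list of delta_n."""
    rng = random.Random(seed)
    values = [0.0, rng.gauss(0.0, 1.0)]  # X(0) = 0 and X(1) of standard deviation 1
    deltas = []
    for _ in range(levels):
        new = levy_refine(values, rng)
        # delta_n = max |X_{n+1} - X_n|, attained at the new midpoints
        deltas.append(max(
            abs(new[2 * i + 1] - 0.5 * (values[i] + values[i + 1]))
            for i in range(len(values) - 1)
        ))
        values = new
    return values, deltas
```

Printing the second return value for, say, `levy_construction(16)` shows how quickly the successive corrections $\delta_n$ shrink, which is the content of the uniform convergence.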

Of course the number N is random. Thus, although the convergence is uniform with respect to t (from zero to one, or in any finite interval), it is not uniform with respect to the choice of X(t); but it suffices to set aside cases of arbitrarily small probability for it to become so.




One easily checks that the function X(t) so obtained satisfies all the conditions stated in the theoretical definition. One can, on the other hand, arrive at the same result by replacing the set of dyadic fractions, which plays an essential role in the experiments just defined, by any countable set of values of t everywhere dense in the interval where X(t) is to be defined. The result obtained is stochastically independent of the choice of this set.

Let us now recall three lemmas whose proofs the reader will find in our earlier works [Processus, formulas (15) and (20); Var. aléatoires, p. 172]. The first goes back to Bachelier.


LEMMA 1. For $x > 0$, one has

$$\Pr\Bigl\{\max_{0 \le u \le t} X(u) > x\Bigr\} = \Pr\{|X(t)| > x\} = \sqrt{\frac{2}{\pi t}} \int_x^{\infty} e^{-u^2/2t}\,du.$$


LEMMA 2. For x greater than both 0 and a, one has

$$\Pr\Bigl\{\max_{0 \le u \le t} X(u) > x \,/\, X(t) = a\Bigr\} = e^{-2x(x-a)/t}.$$

Recall that Pr{A/B} denotes the probability of A under the hypothesis B.


LEMMA 3. Given $t_1 > 0$ and $c > 1$, there almost surely exists a number $\eta > 0$ such that, for $0 \le t \le t_1$ and $0 < \tau < \eta$, one has

$$|X(t + \tau) - X(t)| < c\sqrt{2\tau \log(1/\tau)}.$$

If on the contrary $c < 1$, the probability of the existence of $\eta$ is zero.


2. Local study of X(t). We may confine ourselves to studying the properties of X(t) in the neighborhood of the origin; the results obtained will evidently apply in the neighborhood of any other point, either to the right or to the left of that point. But of course, if one finds that a certain circumstance is infinitely improbable in the neighborhood of any point given in advance, this does not prevent there being, with positive probability, points, impossible to know in advance, in whose neighborhood it is realized.

Let us begin by establishing a theorem that reduces the local study of X(t) to the asymptotic study of this function for t infinite. To state it, we set

$$(2)\qquad t = e^{u}, \qquad X(t) = \sqrt{t}\,\varphi(u),$$

so that $\varphi(u)$ is, for each value of u, a reduced Gaussian variable.




THEOREM 2. The stochastic definition of $\varphi(u)$ is invariant under the change of u into −u (and evidently also under that of u into u + c).

In other words, the stochastic properties of $X(t)/\sqrt{t}$ are invariant under the change of t into 1/t.

Consider first the sequence of values $t_n = q^n$ of t, q being here any number between zero and one; set $X(t_n) = X_n = \varphi_n \sqrt{t_n}$. From the independence of the successive increments of X(t) one deduces

$$(3)\qquad \varphi_{n-1} = \varphi_n \sqrt{q} + \zeta_n \sqrt{1 - q},$$

$\zeta_n$ being a reduced Gaussian variable independent of $\varphi_n, \varphi_{n+1}, \ldots$; on the other hand, by the interpolation principle set out in § 1, since $X(t_{n+1})$ can be determined by interpolation between X(0) and $X(t_n)$, one has

$$(4)\qquad \varphi_{n+1} = \varphi_n \sqrt{q} + \zeta'_n \sqrt{1 - q},$$

$\zeta'_n$ being a reduced Gaussian variable independent of $\varphi_n, \varphi_{n-1}, \ldots$. The same experiments therefore have to be made to determine the sequence of the $\varphi_n$ from left to right or from right to left. In other words, the stochastic nature of this sequence is invariant under the change of n into −n (or into h − n, h being a given integer).

This stochastic symmetry is moreover not destroyed when, after having determined $\varphi(n)$ (that is, $\varphi_{-n}$, if one has taken for q the value 1/e) for all integer values of n, one carries out interpolations to determine the numbers $\varphi(n + \tfrac12)$; each number $\varphi(n + \tfrac12)$ has in fact the same correlation with $\varphi(n)$ and with $\varphi(n + 1)$. The result obtained is moreover none other than the sequence of the $\varphi_n$ for $q = e^{-1/2}$. By carrying out new interpolations, one arrives at the definition of $\varphi(u)$ for the values of u that are multiples of 1/4, then of 1/8, and so on. At each of these operations the symmetry is preserved, and one ends, in the limit, with the determination of $\varphi(u)$ by an absolutely symmetric stochastic process. Now this process is indeed equivalent to that by which we defined $\varphi(u)$, since, in the definition of X(t), nothing prevents us from choosing, for the interpolations, the values of t whose logarithms are the values of u that occur in the procedure for determining $\varphi(u)$ just described. Theorem 2 is thus proved.
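The symmetry asserted by Theorem 2 can also be checked on covariances. This is a supplementary verification, not the argument of the text; it assumes two standard facts, namely that $\mathrm{E}\{X(s)X(t)\} = \min(s, t)$ for the process defined in § 1, and that a centered Gaussian process is determined by its covariance.

```latex
\begin{aligned}
\mathrm{E}\{\varphi(u)\,\varphi(v)\}
  &= e^{-(u+v)/2}\,\mathrm{E}\{X(e^{u})\,X(e^{v})\}
   = e^{-(u+v)/2}\,\min(e^{u}, e^{v}) \\
  &= e^{-|u-v|/2}.
\end{aligned}
```

The covariance depends only on $|u - v|$, so it is unchanged when u and v are replaced by −u and −v, or by u + c and v + c. For two consecutive values $t_n = q^n$ and $t_{n+1} = q^{n+1}$ it equals $\sqrt{q}$, in agreement with the coefficient in relations (3) and (4).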

It has many consequences that follow quite simply. Thus the set of roots of X(t) almost surely has no upper bound; this follows easily from the fact that, by Lemma 1 applied to $x - X(u)$, one can determine a function f(t, x) greater than t and such that, under the hypothesis X(t) = x, X(u) has, with a given probability $\alpha$, at least one root between t and f(t, x). Taking then for $\tau_0$ an arbitrary number, and setting $\tau_{n+1} = f[\tau_n, X(\tau_n)]$, one obtains a sequence of increasing random numbers which separate intervals each of which has probability $\alpha$ of containing a root of X(t). These probabilities being independent, it follows from the strong law of large numbers, in its most classical form, that there are almost surely infinitely many intervals $(\tau_n, \tau_{n+1})$, with a frequency tending to $\alpha$, each containing at least one root of X(t).

It then follows from Theorem 2 that the positive roots of X(t) are not bounded below either: it is almost sure that the root zero of X(t) is not isolated.

Of course, as whenever one applies a method of transformation, one can transform not only the theorem but also its proof, and so obtain a proof independent of the transformation employed. In the present case, this gives both the simplest proof of the result obtained and the one that best allows its extension to other types of random functions.

The set of roots of X(t) being closed, and containing no interval, there exist of course roots isolated on the left, or on the right; they form at most a countable set. If we define a root $\theta$ by the condition of being the smallest root at least equal to a given number $t_0$, then, except perhaps under the infinitely improbable hypothesis $X(t_0) = 0$, it is isolated on the left; since this information in no way modifies the probable behavior of X(t) for $t > \theta$, it is almost sure that $\theta$ is, like zero, a root not isolated on the right. Now there are at most countably many roots isolated on the left, hence countably many occasions to find a root isolated both on the left and on the right, and each time the probability of this circumstance is zero. The total probability is therefore zero: there is almost surely no root of X(t) isolated both on the left and on the right.

Naturally, this result applies to the roots of X(t) − x if x is given. Yet X(t) has in every interval a countable infinity of maxima and minima; but the set of maximum or minimum values of X(t) has no chance of containing a value x given in advance.

For a more complete study of the set of roots of X(t), the reader may refer to our earlier memoir (Processus, § 7).

Let us indicate another application of Theorem 2. Khintchine established the law of the iterated logarithm, according to which

$$(5)\qquad \Pr\Bigl\{\limsup_{|u| \to \infty} \frac{|\varphi(u)|}{\sqrt{2 \log |u|}} = 1\Bigr\} = 1.$$

He proved it successively for u tending to $+\infty$ and for u tending to $-\infty$ (hence t tending to zero). By Theorem 2, either of these results implies the other.




We now propose, still supposing $q < 1$ to fix ideas, to study some properties of the sequence of numbers $X_n = \varphi_n \sqrt{t_n}$. By the recurrence relation (4), they form a simple Markoff chain, homogeneous in time (that is, stationary; but this term, though generally used, seems to us improper). We could refer the reader to the general theory of these chains; but the precise study of a particular case is perhaps not without use.
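The recurrence (4) is easy to simulate, and the simulation gives a feeling for how the frequencies of the chain behave over a long run. The sketch below is only an illustration: the function names are hypothetical, the chain is started from a reduced Gaussian value, and the comparison with the Gaussian distribution function is an empirical check, not a proof.

```python
import math
import random

def normal_cdf(x):
    """Distribution function of a reduced Gaussian variable."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def simulate_chain(n_steps, q, seed=0):
    """Simulate phi_{n+1} = phi_n * sqrt(q) + zeta * sqrt(1 - q) for n_steps values."""
    rng = random.Random(seed)
    a, b = math.sqrt(q), math.sqrt(1.0 - q)
    phi = rng.gauss(0.0, 1.0)  # start from the reduced Gaussian law
    chain = [phi]
    for _ in range(n_steps - 1):
        phi = a * phi + b * rng.gauss(0.0, 1.0)
        chain.append(phi)
    return chain

def frequency_below(chain, x):
    """Fraction of the values of the chain that are less than x."""
    return sum(1 for v in chain if v < x) / len(chain)
```

For instance, with `chain = simulate_chain(100000, 0.5)`, the value `frequency_below(chain, 1.0)` can be compared with `normal_cdf(1.0)`. The closer q is to one, the stronger the correlation between neighboring terms and the longer the run needed for the two numbers to agree.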

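Let us first indicate the formula

$$(5')\qquad \Pr\Bigl\{\limsup_{n \to \infty} \frac{|\varphi_n|}{\sqrt{2 \log n}} = 1\Bigr\} = 1,$$

analogous to formula (5). It will suffice for what follows to know that the limit considered cannot exceed unity, that is, that if $c > 1$, there almost surely exists a number N such that, for $n > N$, one has

$$(6)\qquad |\varphi_n| < c\sqrt{2 \log n}.$$

This follows immediately from the fact that the probability of the inverse inequality, equal to

$$\frac{2}{\sqrt{2\pi}} \int_{c\sqrt{2\log n}}^{\infty} e^{-u^2/2}\,du < \frac{2}{\sqrt{2\pi}} \int_{c\sqrt{2\log n}}^{\infty} \frac{u}{c\sqrt{2\log n}}\, e^{-u^2/2}\,du = \frac{1}{c\,n^{c^2}\sqrt{\pi \log n}},$$

is the general term of a convergent series.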


Let us now prove that the strong law of large numbers applies to the sequence of the $\varphi_n$, that is:

THEOREM 3. The frequency of the $\varphi_n$ less than a given number x tends almost surely (and uniformly) to the theoretical probability

$$\Pr\{\varphi_n < x\} = F(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-u^2/2}\,du.$$

We shall show that the difference between this frequency and F(x) almost surely becomes less in absolute value than a given arbitrarily small number, which we denote by $3\epsilon$. It evidently suffices to establish the result analogous to that stated, but involving only the values of the $\varphi_n$ for the values of n of the form $n_0 + \nu p$ ($\nu = 0, 1, 2, \ldots$); p is an integer which nothing prevents us from choosing as a function of $\epsilon$ and as large as necessary, and $n_0$ is any one of the numbers $1, 2, \ldots, p$. The guiding idea of the proof is that, if p is large enough, $\varphi_n$ and $\varphi_{n+p}$ are almost independent, and that one can apply the strong law of large numbers as if a sequence of independent random variables were in question.




By formula (4), which can be applied to the study of the correlation between $\varphi_n$ and $\varphi_{n+h}$ provided q is replaced by $q^h$, one has

$$(7)\qquad \varphi_{n+h} = \varphi_n \sqrt{q^h} + \psi_n \sqrt{1 - q^h},$$

$\psi_n$ being a reduced Gaussian variable independent of $\varphi_n$. The correlation between $\varphi_n$ and $\varphi_{n+h}$ is thus indeed the weaker the larger h is. Denoting by $\rho$ a given positive number, we choose h large enough that

$$(8)\qquad \rho\sqrt{q^h} \le \epsilon, \qquad \sqrt{1 - q^h} \ge 1 - \epsilon.$$

If then $|\varphi_n| \le \rho$, it follows from formula (7) that

$$(9)\qquad |\varphi_{n+h} - \psi_n| \le \epsilon\,(1 + |\psi_n|),$$

from which one easily deduces, at least if $\epsilon$ is small enough,
(10) | < dn | =o} —F(a)| Se? 


If $\psi$ denotes a reduced Gaussian variable, the expression


| 
(11) = Max [ (1 + lvl), 


where Max [a, b] denotes the larger of the numbers a and b, is a random variable with finite expected value. Denoting its distribution function by $G(\omega)$, one can therefore take for p an integer such that


od G(o) €, 
Dp 


and, denoting by H the smallest multiple of p at least equal to $\Omega$, its expected value is
Dp oO 
f HdG(w) paG(o) + f (p + 0) dG(w) < p+e 
p 


Désignons alors par yo, °°, Wx,° des déterminations inde- 
pendantes de y, et, par H;,, la détermination de H qui correspond a 


log (avec 


$^5$ One need only use the obvious remark that, if […] $\psi'$ and $|\psi''|$ […] and if,
$f(\psi)$ being the probability density of $\psi$, […] and $f(x) <$ […], the error committed
on the distribution function is at most $\varepsilon_1 m_1 + \varepsilon_2 m_2$ when
$\psi$ is replaced by $\psi''$. In the case of the Gaussian law, $m_1 =$ […]; if
$\varepsilon$ is small enough, one can take $\varepsilon_1 = \varepsilon$ and
$\varepsilon_2 =$ […]; formula (10) follows.



LE MOUVEMENT BROWNIEN PLAN. 499 


By the strong law of large numbers (which applies to $H$, by a theorem of
A. Kolmogoroff), we have almost surely

(12) $\lim_{k \to \infty} \frac{1}{k} \sum_{i=1}^{k} H_i = \mathcal{E}\{H\} < p + \varepsilon$.


Consider then the sequence of numbers

$N_0 = n_0$, $N_1 = N_0 + H_0$, $\dots$, $N_{k+1} = N_k + H_k$, $\dots$;

whatever $H_0$ may be, they include most of the terms of the arithmetic progression

$n_0$, $n_0 + p$, $n_0 + 2p$, $\dots$,

the frequency of the missing terms being almost surely, from a certain point onward, less
than $\varepsilon/(p + \varepsilon)$, hence less than $\varepsilon$. We may therefore
consider only the values of $n$ of the form $N_k$, and it suffices to prove that, among
these values, the frequency of those for which $\varphi_n < x$ differs from $F(x)$ by at
most $2\varepsilon$.

To this end we shall show that, $\varphi'_k$ denoting the determination of $\varphi_n$ for
$n = N_k$, formula (8) can be applied for $c = |\varphi'_k|$ and $h = H_k$, and
consequently also formula (9), in which $\psi_n$ can evidently be identified with the
Gaussian variable denoted above by $\psi'_k$; it therefore reads

$|\varphi'_{k+1} - \psi'_k| \le \varepsilon(1 + |\psi'_k|)$.

One need only proceed by induction. The definition of $H_0$ having remained arbitrary, we
may suppose this number large enough for the stated result to hold for $k = 0$. We may then
suppose it true for a certain value $k$. Under these conditions it follows from formula
(9), and from the definition of $H_{k+1}$ (according to which $H \ge \Omega$), that

[…]

that is,

[…]

The new application of formula (8), and consequently that of formula (9) for the value
$k + 1$, are therefore justified.

Formula (10), a consequence of formulas (8) and (9), therefore also applies for
$n + h = N_k + H_k = N_{k+1}$, $\varphi_{n+h} = \varphi'_{k+1}$,
$c = |\varphi_n| = |\varphi'_k|$. The conditional distribution function governing
$\varphi'_{k+1}$ when $\varphi'_k$ is known therefore differs from $F(x)$ by at most
$\varepsilon$, and, by the strong law of large numbers for chained variables, the frequency
of the values less than $x$ among the numbers $\varphi'_1, \varphi'_2, \dots, \varphi'_k$
is almost surely, from a certain value of $k$ onward, between $F(x) - 2\varepsilon$ and
$F(x) + 2\varepsilon$, Q.E.D.




COROLLARY 1. The frequency of the values of $n$ for which

(13) $\varphi_n \le x$, $\varphi_{n+1} \le y$

tends almost surely, for $n$ infinite, to the theoretical probability of these
inequalities, deduced from formula (4), that is, to the expression

[…]

We indicate only the principle of the proof, whose details the reader will easily
reconstruct. Each small interval of possible values of $\varphi_\nu$ is realized, for the
values $1, 2, \dots, n$ of $\nu$, with a frequency that differs little, if $n$ is large,
from its theoretical probability; hence a large number of times. The strong law of large
numbers can therefore be applied once more, leading to the conclusion that these values of
$\varphi_\nu$ entail the various possible values of $\varphi_{\nu+1}$ with frequencies that
differ little from their theoretical probabilities.

This corollary extends without difficulty to the case where any number of consecutive
terms of the sequence of the $\varphi_n$ are considered simultaneously.

COROLLARY 2. The frequency of the changes of sign in the sequence of the $\varphi_n$ tends
almost surely to $(1/\pi) \arctan \sqrt{(1 - q)/q}$ (hence to $1/4$, if $q = 1/2$).

This corollary is evidently a special case of Corollary 1.
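
Corollary 2 lends itself to a quick numerical check. The Python sketch below is an
illustration only and is not part of the original argument: it assumes that the normalized
sequence obeys the recurrence $\varphi_{n+1} = \varphi_n\sqrt{q} + \psi_n\sqrt{1-q}$
(formula (7) with $h = 1$), and the function names are hypothetical.

```python
import math
import random


def theoretical_sign_change_frequency(q: float) -> float:
    """Limit stated in Corollary 2: (1/pi) * arctan(sqrt((1 - q) / q))."""
    return math.atan(math.sqrt((1.0 - q) / q)) / math.pi


def simulated_sign_change_frequency(q: float, n_steps: int, seed: int = 0) -> float:
    """Empirical frequency of sign changes along one simulated sequence.

    Assumption: phi_{n+1} = sqrt(q) * phi_n + sqrt(1 - q) * psi_n, with the
    psi_n independent standard Gaussian variables.
    """
    rng = random.Random(seed)
    a, b = math.sqrt(q), math.sqrt(1.0 - q)
    phi = rng.gauss(0.0, 1.0)  # stationary start: standard Gaussian
    changes = 0
    for _ in range(n_steps):
        nxt = a * phi + b * rng.gauss(0.0, 1.0)
        if (nxt < 0.0) != (phi < 0.0):
            changes += 1
        phi = nxt
    return changes / n_steps
```

For $q = 1/2$ the theoretical value is $1/4$; a run of a few hundred thousand steps should
give an empirical frequency close to it.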


COROLLARY 3. The frequency of the values of $n$ for which the interval $(t_{n+1}, t_n)$
contains at least one root of $X(t)$ tends almost surely to
$(2/\pi) \arctan \sqrt{(1 - q)/q}$.

This corollary follows immediately from Corollary 1 and from the strong law of large
numbers. If the sequence of the $X_n$ is supposed known, the function $X(t)$ then having to
be determined by interpolations in each of the intervals $(t_{n+1}, t_n)$, the frequency of
the intervals containing at least one root is almost surely infinitely close to the mean of
the theoretical conditional probabilities. Recall that, for each of these intervals, this
conditional probability, evidently equal to one for $X_n X_{n+1} \le 0$, is in the contrary
case, by Lemma 2,

$\exp\left(-\dfrac{2 X_n X_{n+1}}{t_n - t_{n+1}}\right) = \exp\left(-\dfrac{2\sqrt{q}\,\varphi_n \varphi_{n+1}}{1 - q}\right)$.


It evidently follows from Corollary 1 that this conditional probability converges almost
surely, in arithmetic mean, to its expected value. The frequency of the intervals
$(t_{n+1}, t_n)$ that contain at least one root therefore itself converges almost surely to
this expected value, that is, to the a priori probability of the existence of a root in the
interval $(t_{n+1}, t_n)$ when neither $X_n$ nor $X_{n+1}$ is known. This probability has
the known value $(2/\pi) \arctan \sqrt{(1 - q)/q}$ [Processus, formulas (42) and
(44)$^6$], which completes the proof of Corollary 3.


3. Planar Brownian motion. 1°. This motion is that of a point $A(t)$ whose rectangular
coordinates $X(t)$ and $Y(t)$ are two random functions of the type we have just studied,
independent of each other. For each value of the parameter $t$, which we shall call the
height (cote) of $A(t)$, this point obeys the isotropic Gaussian law with standard
deviation $\sqrt{t}$; by standard deviation we mean here, not the root mean square of the
distance $R(t)$ from the point $A(t)$ to the origin $O$, but the common root mean square of
$X(t)$ and $Y(t)$. It is known that $R(t)$ obeys the law defined by

(15) $\Pr\{R(t) > r\} = e^{-r^2/2t}$.

The properties of the vector $\overrightarrow{A(t)A(t+\tau)}$, the displacement of the
moving point during the time interval $(t, t+\tau)$, are naturally independent of $t$ and
of the past; this vector obeys the isotropic Gaussian law with standard deviation
$\sqrt{\tau}$.

Many properties of planar Brownian motion are such evident consequences of the
corresponding properties of linear Brownian motion that it suffices to state them. Such is
the case of the interpolation principle: if $t_0 < t_1 < t_2$, and if $A(t_0)$ and $A(t_2)$
are known, the probable position of $A(t_1)$ is the point $M_1$ obtained by linear
interpolation, and $\overrightarrow{M_1A(t_1)}$ obeys the isotropic Gaussian law with
standard deviation

$\sqrt{\dfrac{(t_1 - t_0)(t_2 - t_1)}{t_2 - t_0}}$.
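
The interpolation principle can be sketched numerically. The Python fragment below is
illustrative only: it assumes the standard Brownian-bridge form of the conditional law,
namely a mean at the linearly interpolated point $M_1$ and a standard deviation
$\sqrt{(t_1 - t_0)(t_2 - t_1)/(t_2 - t_0)}$ for each coordinate. The function names are
hypothetical.

```python
import math
import random


def interpolation_law(t0, t1, t2, a0, a2):
    """Return (M1, std) for A(t1) given A(t0) = a0 and A(t2) = a2.

    M1 is the linearly interpolated point; std is the per-coordinate standard
    deviation of the isotropic Gaussian vector from M1 to A(t1).
    """
    if not (t0 < t1 < t2):
        raise ValueError("expected t0 < t1 < t2")
    w = (t1 - t0) / (t2 - t0)  # interpolation weight
    m1 = tuple(x0 + w * (x2 - x0) for x0, x2 in zip(a0, a2))
    std = math.sqrt((t1 - t0) * (t2 - t1) / (t2 - t0))
    return m1, std


def sample_intermediate_point(t0, t1, t2, a0, a2, rng):
    """Draw one determination of A(t1) from the conditional law above."""
    m1, std = interpolation_law(t0, t1, t2, a0, a2)
    return tuple(m + std * rng.gauss(0.0, 1.0) for m in m1)
```

Applying `sample_intermediate_point` repeatedly at midpoints, starting from the two ends of
an arc, builds the curve by successive interpolations.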


2°. To study the shape of the curve in the neighborhood of the origin $O$, we shall again
consider the sequence of values $t_n = q^n$ of $t$ ($0 < q < 1$). We write $A_n$ as an
abbreviation for $A(t_n)$, and denote by $M_n$ the probable position of this point when $O$
and $A_{n-1}$ are known; it is the point defined by the vector formula

$\overrightarrow{OM_n} = q\,\overrightarrow{OA_{n-1}}$.

Note that $\overrightarrow{OM_n}$ and $\overrightarrow{M_nA_n}$ are two vectors,
independent of each other, and obeying respectively the same laws as
$\overrightarrow{OA_{n+1}}$ and $\overrightarrow{A_{n+1}A_n}$; the stochastic properties of
the two triangles $OM_nA_n$ and $OA_{n+1}A_n$ are therefore identical (if $q = 1/2$, one
may write indifferently $OM_nA_n$ or $A_nM_nO$).


$^6$ These two formulas, which we established separately, are equivalent; in one, the
interval denoted here by $(t_{n+1}, t_n)$ is denoted by $(t, t + h)$; in the other, it is
denoted by $(t - u, t)$.




The angle at $O$ of each of these triangles, oriented and measured from $-\pi$ to $+\pi$,
is a random variable $\theta$ whose law depends only on $q$; it is symmetric; the root mean
square of $\theta$ is a positive number $\sigma = \sigma(q)$, less than $\pi/2$.

3°. We denote by $R_n = \rho_n \sqrt{t_n}$ and $\Phi_n$ the polar coordinates of $A_n$; for
$\Phi_n$ we take the determination obtained by supposing $|\Phi_0| \le \pi$ and by
regarding the $A_n$ as successive positions of a moving point that travels along the
polygonal line $A_0A_1A_2 \cdots$; each of the increments $\Phi_{n+1} - \Phi_n = \theta_n$
is therefore a random variable of the type just considered.$^7$ But the various $\theta_n$
are not independent. If the points $A_n$ are determined successively in order of increasing
$n$, the conditional law governing $\theta_n$ when the past is known depends only on $R_n$;
it is always symmetric and has a root mean square $\sigma'_n$, a function of $R_n$ and $q$;
the a priori expected value of $\sigma'^2_n$ is naturally $\sigma^2$.

As regards the $R_n$, one easily establishes the following theorem, analogous to Theorem 3
for linear Brownian motion:

THEOREM 4. The frequency of the values greater than $r$ in the sequence of the $\rho_n$
tends almost surely to the limit

$\Pr\{\rho_n > r\} = \Pr\{R_n > r\sqrt{t_n}\} = e^{-r^2/2}$.

We shall not develop the proof, which is entirely analogous to that of Theorem 3. Taking
into account the independence of $X(t)$ and $Y(t)$, one can moreover regard it as a
corollary of Theorem 3.
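
Theorem 4 can be illustrated by simulation. The sketch below is a hedged illustration, not
the author's proof: it assumes that each normalized coordinate follows the recurrence
$\varphi_{n+1} = \varphi_n\sqrt{q} + \psi_n\sqrt{1-q}$, that the two coordinates are
independent, and that the limit law is $e^{-r^2/2}$. The function name is hypothetical.

```python
import math
import random


def radial_exceedance_frequency(q: float, r: float, n_steps: int, seed: int = 0) -> float:
    """Empirical frequency of rho_n > r along one simulated planar sequence.

    Assumption: the normalized coordinates (x, y) of A_n / sqrt(t_n) evolve as
    two independent Gaussian recurrences with coefficient sqrt(q).
    """
    rng = random.Random(seed)
    a, b = math.sqrt(q), math.sqrt(1.0 - q)
    x, y = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    count = 0
    for _ in range(n_steps):
        if math.hypot(x, y) > r:  # rho_n = normalized distance to the origin
            count += 1
        x = a * x + b * rng.gauss(0.0, 1.0)
        y = a * y + b * rng.gauss(0.0, 1.0)
    return count / n_steps
```

For $r = 1$ the assumed limit is $e^{-1/2} \approx 0.607$, and long runs should give a
frequency near that value.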

Let us also indicate, as regards the $R_n$, the formula

(16) $\Pr\left\{\limsup_{n \to \infty} \dfrac{\rho_n}{\sqrt{2 \log n}} = 1\right\} = 1$,

analogous to formula (5').

Since it is evident that $R_n \ge |X_n|$, hence $\rho_n \ge |\varphi_n|$, the lower bound
given for $|\varphi_n|$ by formula (5') applies to $\rho_n$. To establish formula (16), it
remains to show that, for $c > 1$, there almost surely exists an $N$ such that, for
$n > N$, we have

(17) $\rho_n < c\sqrt{2 \log n}$.

This evidently follows from the fact that the probability of the opposite inequality,
$n^{-c^2}$, is the general term of a convergent series.
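
The convergence invoked for (17) can be written out as follows. This is a sketch that
assumes the radial law $\Pr\{\rho_n > r\} = e^{-r^2/2}$ of Theorem 4 and uses the
Borel-Cantelli lemma, which the text leaves implicit.

```latex
\begin{align*}
\Pr\{\rho_n \ge c\sqrt{2\log n}\}
  &= \exp\!\left(-\tfrac{1}{2}\,c^2 \cdot 2\log n\right) = n^{-c^2}, \\
\sum_{n \ge 1} n^{-c^2} &< \infty \qquad \text{for } c > 1 .
\end{align*}
```

By the Borel-Cantelli lemma, almost surely only finitely many of the events
$\rho_n \ge c\sqrt{2 \log n}$ occur, which is the statement (17).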


$^7$ There is an indeterminacy in the value to be taken for $\Phi_n$ if one of the
$|\theta_\nu|$ ($\nu = 0, 1, \dots, n-1$) has the value $\pi$, that is, if the line
$A_0A_1 \cdots A_n$ passes through $O$. The probability of this circumstance being zero, no
difficulty results.




Formula (17) can also be deduced from formula (6) and from the remark that the angle
$\Phi_0$, on which the orientation of the curve about the origin depends, is chosen at
random and is independent of the sequence of the $\rho_n$. If one could then find, with
positive probability, arbitrarily large values of $n$ for which
$\rho_n \ge c\sqrt{2 \log n}$, the same would be true on imposing the additional condition
$\cos \Phi_n > \cos \alpha$, $\alpha$ being small enough that $c \cos \alpha = c' > 1$;
this contradicts formula (6), written with $c$ replaced by $c'$.

The large values of the $\rho_n$ are therefore no larger than those of the $|\varphi_n|$;
in either case, one may call large values those that exceed
$(1 - \varepsilon)\sqrt{2 \log n}$, $\varepsilon$ being a given, very small positive
number. But of course the large values of $\rho_n$ are more frequent than those of the
$|\varphi_n|$. They correspond to vectors $\overrightarrow{OA_n}$ having, to within an
arbitrarily small angle, all possible orientations, and one must select those making a very
small angle, or an angle very close to $\pi$, with a given direction in order to find a
large value of $|\varphi_n|$.


4°. Let us now turn to the angles $\theta_n$ and $\Phi_n$. Note first that, as a
consequence of the fact that the various possible values of $\rho_n$ are realized with
frequencies tending to their theoretical probabilities, and of the fact that the stochastic
nature of the triangle $OA_nA_{n+1}$ is a function of $\rho_n$ supposed known, the various
possible shapes of this triangle, and consequently the various values of $\theta_n$, are
also almost surely realized with frequencies tending to their theoretical probabilities.
This is a new application of the principle used for Corollary 1 of Theorem 3.

Note above all that, $\sigma'_n$ being a function (non-random, bounded, and continuous) of
$\rho_n$, the various values of $\sigma'_n$ are realized with frequencies tending almost
surely to their theoretical probabilities, and we have almost surely

(18) $\lim_{n \to \infty} \dfrac{\sigma'^2_0 + \sigma'^2_1 + \cdots + \sigma'^2_{n-1}}{n} = \sigma^2$,

and consequently also

(18') $\lim_{n \to \infty} \dfrac{1}{n}\left(\theta_0^2 + \theta_1^2 + \cdots + \theta_{n-1}^2\right) = \sigma^2$.$^8$

$^8$ This formula can either be obtained as a direct application of the first paragraph of
the present § 3, 4°, or be deduced from formula (18) and from a formula that we established
previously [Var. aléatoires, formula (22), p. 252], according to which the sequence of the
variables $\theta_n^2 - \sigma'^2_n$ almost surely converges to zero in arithmetic mean.




Suppose that one first determines the absolute values, then the signs, of the $\theta_n$.
Once the absolute values are determined, and supposing $OA_0$ taken as the origin of polar
angles, $\Phi_n$ takes the form

(19) $\Phi_n = \sum_{\nu=0}^{n-1} \varepsilon_\nu |\theta_\nu|$,

the signs $\varepsilon_\nu$ being chosen at random independently of one another. Since the
$|\theta_\nu|$ are bounded, and the series $\sum \theta_\nu^2$ is divergent, it follows
from the second limit theorem of the calculus of probabilities that $\Phi_n$ is
asymptotically a Gaussian variable. Its standard deviation, by (18'), is an infinitely
large quantity equivalent to $\sigma\sqrt{n}$. Hence $\Phi_n/\sigma\sqrt{n}$ is
asymptotically a standard Gaussian variable. In precise terms: $\varepsilon$ being an
arbitrarily small positive number, there almost surely exists a number $N$ such that, for
any $x$ and for $n > N$, we have

(20) $|\Pr'\{\Phi_n < \sigma x\sqrt{n}\} - F(x)| < \varepsilon$,

$\Pr'$ denoting a conditional probability, evaluated on the supposition that the
$|\theta_\nu|$ are known.

The number $N$ is random; but, neglecting cases of probability less than $\varepsilon$, one
can assign to it a non-random upper bound $n'$. Since $\Pr$ is the expected value of
$\Pr'$, and since, in the cases where inequality (20) does not hold, its left-hand side is
in any case at most equal to one, we have, for $n > n'$,

(21) $|\Pr\{\Phi_n < \sigma x\sqrt{n}\} - F(x)| < 2\varepsilon$,

that is: $\Phi_n$ is asymptotically a Gaussian variable; its standard deviation is an
infinitely large quantity equivalent to $\sigma\sqrt{n}$.

The angle $\Phi_n$ thus appears as the gain of a player in a game of heads or tails with a
random stake, which may depend on the earlier stakes and is determined for each toss by a
preliminary experiment. The almost sure formula (18) allows this game to be assimilated to
a game with fixed stake $\sigma$.$^9$

Most of the known results concerning such a game apply here in the same way, in particular
the law of the iterated logarithm, according to which we have almost surely

(22) $\limsup_{n \to \infty} \dfrac{\Phi_n}{\sqrt{2n \log\log n}} = \sigma$,

the same result applying to $-\Phi_n$. The polygonal line $A_1A_2 \cdots A_n \cdots$
therefore turns indefinitely (and very irregularly) about the point $O$ before reaching it,
$\Phi_n$ taking arbitrarily large values of both signs.


$^9$ One can also apply directly the second limit theorem of the calculus of probabilities
in the form, applicable to certain sequences of chained variables, that




5°. Denote by $\Phi'_p$ the angle $A_0OA_p$, measured from $-\pi$ to $+\pi$. It differs
from $\Phi_p$ by a multiple of $2\pi$ which, if $p > 1$, has a positive probability of
being nonzero. Now, in that case, $|\Phi'_p| < |\Phi_p|$. The root mean square of
$\Phi'_p$ is therefore less than that of $\Phi_p$, which is expressed by the formula

(23) $\sigma(q^p) < \sqrt{p}\,\sigma(q)$.

This formula allows us to compare the polar angles obtained for one and the same point
$A_{np}$ when going from $A_0$ to this point, on the one hand along the polygonal line
$A_0A_1A_2 \cdots A_{np}$, on the other hand along the shortened line
$A_0A_pA_{2p} \cdots A_{np}$. The root mean squares of these angles are respectively
$\sqrt{np}\,\sigma(q)$ and $\sqrt{n}\,\sigma(q^p)$. By formula (23), the second is smaller;
this was to be expected: the second line avoids the detours of the first.

Conversely, the detours of the curve $C$ can be followed more and more exactly by taking
for $q$ values closer and closer to unity. In order that $A_n$ coincide with the point
$A(t)$ corresponding to a given value of $t$ between zero and one, we take $q = t^{1/n}$;
the root mean square of the polar angle obtained for $A(t)$ is then $\sqrt{n}\,\sigma(q)$.
It is to be expected that it increases with $n$, since by taking larger and larger values
of $n$ one follows the detours of the curve better and better; by what precedes, it
certainly increases when one passes from an initial value $n_0$ to a value $n_1$ that is a
multiple of $n_0$. To learn the nature of the polar angle $\Phi(t)$ obtained by following
the curve itself (a well-defined angle; the curve is continuous and in general does not
pass through the origin, as we shall see below), it is of interest to try to define the
expression

(24) $\mathcal{E}\{\Phi^2(t)\} = \lim_{n \to \infty} n\,\sigma^2(t^{1/n}) = \log\dfrac{1}{t}\,\lim_{q \to 1} \dfrac{\sigma^2(q)}{1 - q}$.

We shall show that this expression is infinite.


To this end, $H_1$ denoting the foot of the perpendicular dropped from $A_1$ onto $OA_0$,
and $\rho$ and $r$ being positive numbers, consider the hypothesis

(E) $OA_0 > \rho$, $|$[…]$| < r\sqrt{1 - q}$, […]

If it holds, then, as $q$ tends to 1, $A_0A_1$ is small, and the angle $A_0OA_1$ is an
infinitesimal equivalent (in the sense of Bernoulli) to $H_1A_1/OA_0$. We deduce

(25) $\liminf \dfrac{\sigma^2(q)}{1 - q} \ge \Pr\{E\}$ […];


I indicated previously (Var. aléatoires, pp. 237-242). I have preferred here to take
advantage of the symmetry of the laws governing the $\theta_n$, and of the independence of
the $\varepsilon_n$, in order to indicate a simpler proof.




in this formula, $\mathcal{E}'$ denotes an expected value computed under hypothesis $E$; by
taking into account only the cases where this hypothesis holds, we do obtain a lower bound
for the expected value of $\theta_0^2/(1 - q)$ that is to be evaluated. If now $\rho$ and
$1/r$ tend to zero, the first two factors tend to unity; the last factor

[…]

increases indefinitely; the same is therefore true of $\sigma^2(q)/(1 - q)$, Q.E.D.

This result is easily explained by observing that, if the curve passes very close to the
origin, the polar angle varies rapidly. Now, without having a positive probability of
passing exactly through the origin, the arc $A(t)A(1)$ has a positive probability of
passing arbitrarily close to it; it is easily understood that this probability suffices to
make the root mean square of $\Phi(t)$ infinite.

Returning then to the case where $q$ is fixed, consider the sequence of the points $A_n$,
and the corresponding values of the polar angle $\Omega_n = \Phi(t_n)$. This angle appears
as a sum of independent random terms obeying one and the same symmetric law with infinite
root mean square. It is known that under these conditions, through the effect of the large
values that are realized from time to time as $n$ increases, the order of magnitude to be
expected for $\Omega_n$ is, if $n$ is large, greater than that of $\sqrt{n}$. Since these
large values realized from time to time correspond to arcs $A_nA_{n+1}$ that pass very
close to the origin, we see that, although $OA_n$ is in general of the order of magnitude
of $\sqrt{t_n}$, there are sometimes smaller values [and also larger values, as formula
(16) shows]. It follows that, when $t$ tends to zero, $OA(t)$, which must finally become
zero, varies very irregularly: the curve has the appearance of a succession of loops that
close nearer and nearer to the origin.

It may be of interest to be more precise. Let us say only that the law with infinite root
mean square that we have just considered has, for every exponent $\alpha < 2$, a finite
mean of order $\alpha$; it belongs to the domain of attraction of the Gaussian law.
Consequently the variation of the polar angle $\Phi$ along an arc $A(t')A(t'')$, divided by
a suitable function of $t'/t''$, obeys a law that tends to the standard Gaussian law when
this ratio increases indefinitely or tends to zero.


6°. The preceding results evidently apply to the study of the curve $C$ in the neighborhood
of the point corresponding to any given value of $t$, either to the left or to the right of
this point (left meaning here the side of decreasing $t$; right, the side of increasing
$t$). We shall call a tangent at a point a straight line passing through this point and
leaving on one side a small arc of the curve containing this point; there may be
half-tangents, on the left or on the right. At the point $A(t)$ corresponding to a given
value of $t$, or to one chosen at random by an experiment independent of the choice of $C$,
there is almost surely neither a tangent nor a half-tangent.

On the other hand, there almost surely exist exceptional points at which there are
tangents. We have already observed that every interval of variation of $t$ contains a
countable infinity of maxima and minima of $X(t)$; each of them corresponds to a tangent
parallel to the $y$ axis. Since this result applies to tangents parallel to any direction,
there is a continuous infinity of tangents. There is moreover a double infinity of
half-tangents: every straight line cutting the curve is a half-tangent at each of the
endpoints of the intervals exterior to the curve.

There is almost surely no double tangent parallel to the $y$ axis; the maxima and minima of
$X(t)$ indeed form a countable infinity, so that the set of values obtained for these
maxima and minima has no chance of containing a value that is obtained a second time. The
same remark applies to double tangents parallel to a given direction. On the other hand, as
we shall show, there is almost surely a countable infinity of double tangents (but the
probability that one of them has a direction given in advance is zero).

To this end consider two arcs $A(t_0')A(t_1')$ and $A(t_0'')A(t_1'')$ of $C$, and denote
respectively by $\Gamma'$ and $\Gamma''$ the smallest convex contours enclosing these arcs.
There exist from zero to four tangents common to $\Gamma'$ and $\Gamma''$; since $A(t_0')$
and $A(t_1')$ are almost surely interior to $\Gamma'$, and $A(t_0'')$ and $A(t_1'')$ are
interior to $\Gamma''$, if $t_0'$, $t_1'$, $t_0''$ and $t_1''$ are chosen at random, these
lines are almost surely double tangents to $C$.

On any arc of $C$ we can choose two distinct points $A(t')$ and $A(t'')$, then take
$t' - t_0'$, $t_1' - t'$, $t'' - t_0''$, $t_1'' - t''$ small enough that $\Gamma'$ and
$\Gamma''$ are exterior to each other and that the lines just considered exist. For every
arc of $C$ there are therefore double tangents; hence there is at least a countable
infinity of them.

On the other hand, for every double tangent $A(t')A(t'')$, one can define, in the space
representing the set of the four numbers $t_0'$, $t_1'$, $t_0''$, $t_1''$, a domain
($t' - t_0'$, $t_1' - t'$, $t'' - t_0''$, $t_1'' - t''$ positive and small enough) such
that, for every point of this domain, $A(t')A(t'')$ is one of the (at most) four double
tangents that can be defined starting from this point. There can therefore be only a
countable infinity of double tangents.

Denote by $\Gamma$ the smallest convex contour enclosing an arc $A(t')A(t'')$. We shall
show that it almost surely consists of a countable infinity of double tangents, forming a
set everywhere dense around $\Gamma$, that is to say, any tangent to $\Gamma$ is a limit of
double tangents; there is no corner point. In other words, if $M$ is the point of $\Gamma$
with curvilinear abscissa $s$, the tangent at $M$ to $\Gamma$ makes with a fixed direction
an angle $\theta$ which is a function of $s$ that is continuous, evidently monotone, and
whose derivative is zero almost everywhere.

The proof will follow from almost sure properties of $C$ and of $\Gamma$ in the
neighborhood of the point of $C$ where $x$ is minimum: the curvature is infinite, but
$\theta$ varies continuously. These properties must likewise hold at the point of contact
of the tangent defined by a value of $\theta$ chosen at random; they can therefore fail
only for values of $\theta$ having zero probability of being chosen, that is, forming a set
of measure zero. This would not be so if there were corner points on $\Gamma$, or if the
set of points at which the curvature is positive and finite formed on $\Gamma$ a set of
positive linear measure.

Consider therefore the minimum of $x$, which we may evidently suppose attained at the
origin, and for $t = 0$. It also suffices to consider the arc near this point corresponding
to positive values of $t$. We have previously studied the properties of $X(t)$ under these
conditions (Processus, § 9, 1° and 2°): $X(t)$, for a very small value of $t$ chosen at
random, is in general of the order of magnitude of $\sqrt{t}$; the large values, realized
fortuitously but almost surely, are of the order of magnitude of
$\sqrt{2t \log|\log t|}$; the small values are of the order of magnitude of
$\sqrt{t}\,|\log t|^{-1+\varepsilon}$, $\varepsilon$ tending to zero with $t$.

On the other hand the variation of $Y(t)$ is independent of the hypothesis that $X(t)$ is
positive. It is therefore almost sure that, in the sequence of values of $t$ tending to
zero and corresponding either to the small values or to the large values of $X(t)$, one
finds arbitrarily large values of $Y(t)/\sqrt{t}$ (positive or negative), but that
$Y(t)/\sqrt{2t \log|\log t|}$ is bounded (and asymptotically $\le 1$).

From the comparison of the inequalities thus obtained for $X(t)$ and $Y(t)$ it follows on
the one hand that $Y(t)/X(t)$ takes all possible values between $-\infty$ and $+\infty$,
and on the other hand that $Y^2(t)/X(t)$ tends to zero. The hypothesis of a finite
curvature and that of a corner point are thus excluded, Q.E.D.

Finally, let us state without proof the following extension of the preceding results: in
Brownian motion in $p$ dimensions, it is almost sure that, for any arc of the trajectory,
the smallest convex hypersurface containing it in its interior has a well-defined tangent
plane at every point, which varies continuously, but has no curved part and consists of a
countable infinity of plane faces. One can therefore, in the space considered, vary the
orientation of a two-dimensional linear variety without the motion projected onto this
variety failing, for any of them, to have the properties of Brownian motion that we have
just obtained. This result goes much further than the one which consists in saying that,
for each arc of each curve $C$ considered in isolation, they have a probability equal to
unity.

4. The notion of Brownian oscillation. 1°. We first place ourselves in the case of linear
motion, and suppose that $t$ varies from zero to one. Suppose this interval divided into a
large number $n$ of partial intervals, the largest of which has a very small length
$\varepsilon_n$; denote by $\Delta t$ any one of these intervals, by $\Delta X$ the
corresponding variation of $X(t)$, and set

(26) $b_n = \sum (\Delta t)^2$, $B_n = \sum (\Delta X)^2$.

We evidently have

$\mathcal{E}\{B_n\} = \sum \Delta t = 1$,

$\mathcal{E}\{(B_n - 1)^2\} = \sum \mathcal{E}\{[(\Delta X)^2 - \Delta t]^2\} = 3b_n - 2b_n + b_n$,

and consequently

(27) $\mathcal{E}\{(B_n - 1)^2\} = 2b_n \le 2 \operatorname{Max} \Delta t = 2\varepsilon_n$.

If therefore the mode of division of the interval $(0, 1)$ into partial intervals is varied
in such a way that $\varepsilon_n$ tends to zero, $B_n$ converges in quadratic mean to
unity; hence also in probability.

We shall complete this result by studying cases where there is almost sure convergence.

Denote by $B'_p$ the value of $B_n$ when the interval $(0, 1)$ is divided into $2^p$ equal
intervals, and let us first show that: $B'_p$ tends almost surely to unity.

This is an immediate consequence of formula (27), which reads, in the case considered,

$\mathcal{E}\{(B'_p - 1)^2\} = 2^{1-p}$.

Tchebycheff's inequality then gives

$\Pr\{|B'_p - 1| \ge p/2^{p/2}\} \le 2/p^2$,

and, this expression being the general term of a convergent series, there almost surely
exists a value of $p$ from which onward we have

$|B'_p - 1| < p/2^{p/2}$,

which proves the announced result and makes it more precise.
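
The dyadic case can be explored numerically. The sketch below is an illustration under
stated assumptions, not the author's argument: it simulates Gaussian increments on the
finest dyadic grid, merges them pairwise to obtain the coarser subdivisions of the same
path, and returns the sums of squared increments at every level. The function names are
hypothetical, and the threshold $p/2^{p/2}$ is the one appearing in the Tchebycheff step.

```python
import math
import random


def dyadic_quadratic_variations(max_p: int, seed: int = 0) -> dict:
    """Return {p: B'_p} for p = 0..max_p along one simulated path on (0, 1).

    B'_p is the sum of squared increments over 2**p equal intervals.
    """
    rng = random.Random(seed)
    n = 2 ** max_p
    sd = math.sqrt(1.0 / n)  # each finest increment has variance 1/n
    increments = [rng.gauss(0.0, sd) for _ in range(n)]
    result = {}
    for p in range(max_p, -1, -1):
        result[p] = sum(dx * dx for dx in increments)
        # merge adjacent pairs: increments of the same path at level p - 1
        increments = [increments[i] + increments[i + 1]
                      for i in range(0, len(increments) - 1, 2)]
    return result


def chebyshev_threshold(p: int) -> float:
    """Deviation threshold p / 2**(p/2) used with Tchebycheff's inequality."""
    return p / 2.0 ** (p / 2.0)
```

Comparing `abs(result[p] - 1.0)` with `chebyshev_threshold(p)` for increasing `p` shows
the deviations shrinking, in line with the almost sure convergence stated above.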




It would be easy, using only very simple arguments, to generalize this result. We shall
establish a more general theorem, which includes all these almost evident generalizations.

2°. Consider a sequence of values $t_1, t_2, \dots, t_n, \dots$ of $t$ lying between zero
and one, and forming a set everywhere dense in the interval $(0, 1)$. The first $n - 1$
numbers $t_\nu$ define a division of this interval into $n$ partial intervals, with which
we associate as before the non-random sum $b_n$ and the random sum $B_n$: $b_n$, at most
equal to the largest of the partial intervals, tends to zero for $n$ infinite.

THEOREM 5. We have

$\Pr\{\lim B_n = 1\} = 1$.


To prove it, observe that, when $n$ increases by one, the change in the sum $b_n$ comes
from a single term, which we denote by $\tau_n^2$, and which is replaced by a sum
$\tau_n'^2 + \tau_n''^2 = \tau_n^2 - 2\tau_n'\tau_n''$. We deduce

(28) $b_n - b_{n+1} = 2\tau_n'\tau_n''$,

and, for $B_n$, we have likewise

(29) $B_n - B_{n+1} = 2\xi_n'\xi_n''$,

$\xi_n'$ and $\xi_n''$ being the increments of $X(t)$ over two contiguous intervals, of
respective lengths $\tau_n'$ and $\tau_n''$. Since they are independent, we have

(30) $\mathcal{E}\{(B_n - B_{n+1})^2\} = 4\tau_n'\tau_n'' = 2(b_n - b_{n+1})$.

Let us now study the oscillation of $B_n$ when $n$ varies in an interval $(p, q)$; we set

$T_{p,q} = \operatorname{Max}_{p \le n \le q} |B_n - B_q|$.
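
The bookkeeping behind the refinement step can be checked on a small example. The sketch
below is illustrative only; it assumes the identity $b_n - b_{n+1} = 2\tau'\tau''$ for the
two pieces $\tau'$, $\tau''$ into which a new point splits an interval, and its function
names are hypothetical.

```python
def sum_of_squared_lengths(points):
    """b_n: sum of squared lengths of the intervals that `points` cut in (0, 1)."""
    grid = [0.0] + sorted(points) + [1.0]
    return sum((b - a) ** 2 for a, b in zip(grid, grid[1:]))


def refinement_drop(points, new_point):
    """Return (b_n - b_{n+1}, 2 * tau1 * tau2) when new_point is inserted.

    Assumes 0 < new_point < 1 and that new_point is not already a grid point.
    """
    grid = [0.0] + sorted(points) + [1.0]
    # endpoints of the single interval that the new point splits
    left = max(g for g in grid if g < new_point)
    right = min(g for g in grid if g > new_point)
    tau1, tau2 = new_point - left, right - new_point
    before = sum_of_squared_lengths(points)
    after = sum_of_squared_lengths(list(points) + [new_point])
    return before - after, 2.0 * tau1 * tau2
```

For instance, inserting the point 0.25 into the subdivision cut by 0.5 splits the interval
(0, 0.5) into two pieces of length 0.25, and both returned values equal 0.125.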
We shall first consider probabilities, which we denote by accented letters ($\Pr'$ or
$\mathcal{E}'$), evaluated on the supposition that the terms of $B_q$ are known, that is,
that one knows the absolute values of the increments $\Delta X$ whose squares enter into
$B_q$, but not their signs. By (29), the successive determinations of
$B_{q-1}, \dots, B_p$ depend on the signs of $\xi_n'\xi_n''$ for the values
$q - 1, \dots, p$ of $n$. These signs are independent; for every $n$ less than $q$,
$|\xi_n'|$ and $|\xi_n''|$ being known, the two possible signs of $\xi_n'\xi_n''$ are
equally probable; the choice of a sign determines $|\xi_n| = |\xi_n' + \xi_n''|$, and one
finds oneself in the same conditions for determining $|\xi_{n-1}|$ by a new grouping of
terms.

We are thus in the conditions required to apply a known inequality of A. Kolmogoroff, or at
least its extension to the case of certain sums of chained variables that we indicated
previously (Var. aléatoires, pp. 246-247), and we obtain

$\Pr'\{T_{p,q} \ge \varepsilon\} \le \dfrac{1}{\varepsilon^2}\,\mathcal{E}'\{(B_p - B_q)^2\}$.

Now $\Pr$ and $\mathcal{E}$ are respectively the expected values of $\Pr'$ and
$\mathcal{E}'$. Taking formulas (28) and (30) into account, we obtain

(31) $\Pr\{T_{p,q} \ge \varepsilon\} \le \dfrac{2}{\varepsilon^2}(b_p - b_q) < \dfrac{2b_p}{\varepsilon^2}$.

Let us now set

$T_p = \lim_{q \to \infty} T_{p,q}$, $T = \lim_{p \to \infty} T_p$.

These limits, finite or infinite, exist almost surely, because of the monotone character of
$T_{p,q}$, and, $b_p$ tending to zero for $p$ infinite, we deduce from inequality (31)

(32) $\Pr\{T_p > \varepsilon\} \le \dfrac{2b_p}{\varepsilon^2}$.

Since this is true however small $\varepsilon$ may be, it is almost sure that $T = 0$, that
is, that $B_n$ has a limit $B$. Since, finally, there is convergence in probability to
unity, we have $B = 1$, Q.E.D.

Naturally, as in all statements of this kind, the sequence of the $B_n$ converges
uniformly, except in cases of probability less than an arbitrarily small number $\eta$.
Indeed, by (32), one can determine $p_h$ (for $h = 1, 2, \dots$) in such a way that

$\Pr\left\{T_{p_h} > \dfrac{1}{h}\right\} \le 2h^2 b_{p_h} < \dfrac{\eta}{2^h}$,

and consequently

$\Pr\left\{T_{p_h} \le \dfrac{1}{h} \ (h = 1, 2, \dots)\right\} > 1 - \eta$.

Since, in the cases of convergence to unity, $n' \ge n$ entails $|B_{n'} - 1| \le T_n$, the
stated result is indeed established.

Note also that the convergence obtained is independent of the choice of the $t_\nu$, if
this choice is subjected to the sole condition that $\varepsilon_n$ (hence also $b_n$) be
bounded above by a given function of $n$ that tends to zero for $n$ infinite.


3°. Let us now introduce chance into the choice of the $t_n$. We suppose these numbers
chosen successively according to laws that need not be independent of one another; but, for
each $n$, the law in the $n$ variables $t_1, t_2, \dots, t_n$ is well determined and
independent of the experiments that determine $X(t)$; moreover these laws must be such that
$\varepsilon_n$ tends almost surely to zero. This will be the case if, for example, for any
interval $\Delta t$, the probability that $t_n$ lies in this interval has a lower bound
that is independent of $t_1, t_2, \dots, t_{n-1}$ and is the general term of a divergent
series.

The sequence of the $B_n$ then depends on two series of experiments independent of each
other. The first has the purpose of determining the sequence of the $t_n$, and it is known
(cf. Var. aléatoires, p. 22) that it is in any case possible to make the various possible
sequences correspond to the various values of a variable $T$ lying between zero and one, in
such a way that the probability of any set of possible sequences is equal to the measure of
the set of values of $T$ corresponding to them (if this set is not measurable, the
probability is indeterminate between an inner probability and an outer probability, equal
respectively to the inner measure and to the outer measure of this set); each of the $t_n$
is a measurable function of $T$. In an analogous manner, one can represent by a single
variable $U$ the set of choices that successively determine $X(1)$, $X(1/2)$, $X(1/4)$,
$X(3/4)$, $\dots$, and consequently all the functions $X_n(t)$ of Theorem 1; $X(t)$ is the
almost sure limit of $X_n(t)$, the convergence being uniform (in $t$ and $U$) outside a set
of values of $U$ of arbitrarily small measure.

Let $E$ denote the set of points $T, U$ of the square $0 \le T \le 1$, $0 \le U \le 1$ for which $\lim B_n = 1$, and $E'$ the complementary set. We shall show the following.

THEOREM 6. The set $E'$ is measurable and of measure zero.


If one grants that $E$ is measurable, the proof is immediate: it is almost sure that $\lim \varepsilon_n = 0$, and this entails $\lim B_n = 1$ almost surely. In other words, except for values of $t$ forming a set of measure zero, the set of points of $E'$ located on the line $T = t$ has linear measure zero. Hence $E'$ has zero area measure.

It remains to show that $E$ is measurable. It is known that the probability of the convergence of a sequence of random variables, and that of its convergence to a given limit, are always well determined. To show that this theorem is applicable here, one must show not only that each inequality $B_n < \beta$ has a determined probability, that is, that $B_n$ is a measurable function of the point $T, U$, but also that the same holds for any finite combination of inequalities of this form. This second result is moreover an evident consequence, not of the first result taken in isolation, but of that result together with the fact that one and the same representation of the outcome of the experiments on the plane of $T, U$ allows one to study all the functions $B_n$; indeed it is known that, in this plane, the common part of several measurable sets is a measurable set. One is thus reduced to proving that each function $B_n$ is a measurable function of the point $T, U$.

We shall prove a more general result, which will have another application later: if $\Phi(x_1, x_2, \dots, x_n)$ is a continuous function of its $n$ arguments jointly, the expression

$$(33)\qquad \Phi[X(t_1), X(t_2), \dots, X(t_n)]$$

is a measurable function of the point $T, U$.

The proof is immediate, using the definition of $X(t)$ as the limit of the approximations $X_\nu(t)$. On replacing $X(t)$ by $X_\nu(t)$, $\Phi$ is replaced by an expression $\Phi_\nu$ which is a continuous function of $t_1, t_2, \dots, t_n$ and of the $2^\nu$ quantities $X(h/2^\nu)$ that enter into the determination of $X_\nu(t)$. It is a continuous function of a finite number of measurable functions of $T$ or of $U$, hence of the point $T, U$; it is therefore a measurable function of this point.

It therefore suffices to show that $\Phi_\nu$ tends in measure to $\Phi$ as $\nu$ tends to infinity; that is, $\varepsilon$ and $\varepsilon'$ being arbitrarily small, one can determine $\nu'$ such that, for $\nu > \nu'$,

$$\Pr\{|\Phi - \Phi_\nu| > \varepsilon'\} < \varepsilon.$$

Now one can first determine $M$ such that

$$\Pr\{\operatorname{Max} |X(t)| > M\} < \varepsilon/2;$$

the numbers $X(t_1), X(t_2), \dots, X(t_n)$, $X_\nu(t_1), X_\nu(t_2), \dots, X_\nu(t_n)$ being thus bounded (outside cases of probability less than $\varepsilon/2$), one need only consider a region where the function $\Phi$ is uniformly continuous: if then each of the $X(t_h) - X_\nu(t_h)$ ($h = 1, 2, \dots, n$) does not exceed in absolute value a certain modulus of continuity $\eta = \eta(\varepsilon')$, we have $|\Phi - \Phi_\nu| < \varepsilon'$. Now we have seen that, neglecting cases of probability less than an arbitrarily small number (here we take $\varepsilon/2$), $|X(t) - X_\nu(t)|$ can, for all $t$ between zero and one and in all the cases not neglected, be made less than an arbitrarily small number (here $\eta$); it suffices that $\nu$ be large enough. Under these conditions we do have $|\Phi - \Phi_\nu| < \varepsilon'$, except in the neglected cases, whose total probability is less than $\varepsilon/2 + \varepsilon/2 = \varepsilon$, q.e.d.


COROLLARY. The part $B_n(t)$ of the sum $B_n$ that depends on the values of $X(u)$ in the interval $(0,t)$ tends almost surely, as $n$ tends to infinity, to $B(t) = t$, and this uniformly as $t$ varies from zero to one.




The preceding theorem obviously applies for each value of $t$ as for the value one.¹⁰ Considering then a countable set of values $t'_\nu$ of $t$, everywhere dense between zero and one, there is almost sure convergence of each of the $B_n(t'_\nu)$ to $B(t'_\nu)$; indeed, for each $t'_\nu$, divergence occurs only in cases whose probability is zero, and the union of all these cases still has probability zero. Outside these cases, there is convergence at all the points $t'_\nu$ of the monotone function $B_n(t)$ to the limit $B(t)$, which is continuous, hence uniformly continuous, on the closed interval $(0,1)$. It is known that there is then uniform convergence throughout the interval, q.e.d.
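The corollary can be illustrated numerically. The following is a minimal sketch, not the author's construction: it assumes a random-walk approximation of $X(t)$ on a fine grid and division points drawn uniformly at random; the names `brownian_path` and `quadratic_sums` are hypothetical. It evaluates $B_n(t)$ as the sum of the squared increments over the cells of the division lying in $(0,t)$.

```python
import bisect
import math
import random

def brownian_path(num_steps, rng):
    """Random-walk approximation of X on [0, 1], sampled at k / num_steps."""
    scale = math.sqrt(1.0 / num_steps)
    path = [0.0]
    for _ in range(num_steps):
        path.append(path[-1] + rng.gauss(0.0, scale))
    return path

def quadratic_sums(path, division_points, t_values):
    """B_n(t) for each t: sum of squared increments over the cells inside (0, t)."""
    num_steps = len(path) - 1
    # Division points are rounded to the grid; 0 and 1 are always included.
    idx = sorted({0, num_steps, *(round(t * num_steps) for t in division_points)})
    cum = [0.0]  # cum[k] = sum of squared increments over the first k cells
    for a, b in zip(idx, idx[1:]):
        cum.append(cum[-1] + (path[b] - path[a]) ** 2)
    out = []
    for t in t_values:
        k = bisect.bisect_right(idx, round(t * num_steps)) - 1
        out.append(cum[k])
    return out

rng = random.Random(0)
path = brownian_path(40000, rng)
points = [rng.random() for _ in range(2000)]
for t, b in zip([0.25, 0.5, 0.75, 1.0], quadratic_sums(path, points, [0.25, 0.5, 0.75, 1.0])):
    print(t, round(b, 3))
```

With a few thousand division points the printed values should lie close to $t$; the fluctuation is of the order of the square root of the sum of the squared cell lengths.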

4°. We now come to an application of Fubini's theorem, which is fundamental. Let $F$ denote the set in the plane of $T, U$, interior to the square $0 \le T \le 1$, $0 \le U \le 1$, corresponding to the set of cases in which $B_n(t)$ converges uniformly to $t$ in the interval $(0,1)$. One may compute its measure either by integrating with respect to $T$ the linear measure of its section by a line $T = \text{const.}$, or by doing the reverse. It is by the first method of calculation that we determined this measure, and showed that the complement of $F$ is of measure zero. Interchanging the order of the integrations immediately gives us an important result. To state it simply, we shall say that a function $X(t)$ is a model of linear Brownian motion if, the sequence of the $t_n$ being chosen at random, one obtains with probability one a sequence of functions $B_n(t)$ having a non-random limit $B(t)$, a continuous and increasing function of $t$; for each interval $\Delta t$, the variation $\Delta B(t)$ will be the measure of the Brownian oscillation.

The announced consequence of the fact that the set $F$ has measure one is then stated as follows:

THEOREM 7. The stochastic scheme of linear Brownian motion realizes, with probability one, a model of linear Brownian motion; moreover $B(t) = t$.


A few remarks are necessary for a proper understanding of the preceding definition. It is first of all evident that, for a given function $X(t)$, it may happen that $B_n(t)$ has an almost sure limit other than $t$. This limit is necessarily a nondecreasing function of $t$. If it is constant in an interval, the function $X(t)$ is not irregular enough there to give an idea of Brownian motion. It is perhaps also possible, if $X(t)$ behaves too irregularly in the neighborhood of a point, that $B(t)$ be discontinuous there. This is why we have assumed the function $B(t)$ continuous and increasing, and it is evident that in this case one need only take this function as new parameter to be reduced to the case where $B(t) = t$.

10 One notes moreover that, because of the independence of the oscillations of $X(u)$ in the two intervals $(0,t)$ and $(t,1)$, there can be almost sure convergence of $B_n = B_n(1)$ to a limit only if $B_n(t)$ and $B_n - B_n(t)$ separately have almost sure limits.

One must then take care that this change of parameter modifies the probability law on which the choice of the $t_n$ depends; it is with the probability law thus transformed that one can regard the new function $X(t)$ as a model of linear Brownian motion for which $B(t) = t$.

It may be useful to specify the law on which the choice of the $t_n$ depends, so that the definition of what we call a model does not depend on any arbitrary element. The simplest course is to suppose that each $t_n$ is chosen independently of the others, with probability uniformly distributed from zero to one. One can show that: the notion of model of linear Brownian motion thus obtained is unchanged if this uniform law is replaced by another absolutely continuous law whose probability density lies between two positive numbers.

We shall indicate only the principle of the proof. Even if one supposes only that $B_n(1)$ tends almost surely to $B(1)$, it follows that, for every $t$ between 0 and 1, $B_n(t)$ almost surely has a limit $B(t)$; otherwise the oscillations of $B_n(t)$, which would depend on the choice of the points of division between 0 and $t$ rather than on their frequency, would not be almost surely compensated by those of $B_n(1) - B_n(t)$, which depend on the points of division chosen between $t$ and 1.

It evidently follows that one can increase in a given ratio the probability of one of the intervals $(0,t)$ and $(t,1)$ and decrease that of the other accordingly; this cannot prevent $B_n(t)$ and $B_n(1) - B_n(t)$ from tending respectively, and almost surely, to $B(t)$ and $B(1) - B(t)$; hence $B_n(1)$ tends to $B(1)$.

One can reason in the same way for any division of the interval $(0,1)$ into partial intervals, and an easy passage to the limit leads to the stated result.

Consequently, even if one makes the definition of a model of linear Brownian motion precise by the condition that, for the choice of each $t_n$, the probability be uniformly distributed, if one finds for $B_n(t)$ an almost sure limit $T = B(t)$, then, provided all the ratios $\Delta T/\Delta t$ lie between two positive numbers, the change of variable that consists in taking $T$ as new variable is legitimate. One is thus reduced to the case where $B(t) = t$.


5°. The convergence of $B_n = B_n(1)$ to $B = B(1)$ is of course almost sure, but not sure. Let $\underline{B}_n$ denote the lower bound of $B_n$, and $\overline{B}_n$ its upper bound, when the points of division are varied. Concerning these numbers we have the following results:


THEOREM 8. For any model of Brownian motion, $\underline{B}_n$ tends to zero as $n$ tends to infinity.


THEOREM 9. For the random function $X(t)$ of the scheme of linear Brownian motion, it is almost sure that $\overline{B}_n$ increases indefinitely with $n$.


The first of these theorems results from the fact that a model of linear Brownian motion is necessarily a continuous function $X(t)$. One can then, if $n$ is large enough, define between $X(0)$ and $X(1)$ a sequence of increasing numbers $x_1, x_2, \dots, x_{n-1}$ such that the sum of the squares of the intervals thus separated is arbitrarily small, then define between zero and one increasing numbers $t_1, t_2, \dots, t_{n-1}$ such that $X(t_\nu) = x_\nu$ ($\nu = 1, 2, \dots, n-1$). One thus obtains for $B_n$ an arbitrarily small value, q.e.d.
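The construction used in this proof can be sketched numerically. The code below is an illustration under stated assumptions, not part of the original argument: the path is a discrete random walk (so a level is reached only up to one step of overshoot), the levels $x_\nu$ are taken equally spaced between $X(0)$ and $X(1)$, and the names `random_walk_path` and `level_crossing_sum` are hypothetical.

```python
import math
import random

def random_walk_path(num_steps, rng):
    """Random-walk approximation of X on [0, 1], sampled at k / num_steps."""
    scale = math.sqrt(1.0 / num_steps)
    path = [0.0]
    for _ in range(num_steps):
        path.append(path[-1] + rng.gauss(0.0, scale))
    return path

def level_crossing_sum(path, num_levels):
    """Sum of squared increments taken at the successive times where the path
    first reaches equally spaced levels between path[0] and path[-1]."""
    x0, x1 = path[0], path[-1]
    levels = [x0 + (x1 - x0) * k / num_levels for k in range(1, num_levels)]
    direction = 1.0 if x1 >= x0 else -1.0
    chosen = [x0]
    i = 0
    for level in levels:
        # Advance to the first sample at or beyond the level, moving towards X(1).
        while i < len(path) - 1 and direction * (path[i] - level) < 0:
            i += 1
        chosen.append(path[i])
    chosen.append(x1)
    return sum((b - a) ** 2 for a, b in zip(chosen, chosen[1:]))

rng = random.Random(1)
path = random_walk_path(100000, rng)
for n in (10, 100, 1000):
    print(n, level_crossing_sum(path, n))
```

Each increment is at most the level spacing plus one step of the walk, so the sum is bounded by $n(\text{spacing} + \text{step})^2$, which is small when $n$ is large and the grid is fine, whereas a division chosen independently of the path gives a sum near 1.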

One even sees easily that one can take for the $t_n$ any countable set everywhere dense between zero and one, given in advance; it suffices to arrange them in a suitable order for $B_n$ to tend to zero (or to any given value between zero and $B$).

To prove theorem 9, observe that, if the points of division are numerous enough, the values of the increments $\Delta X$ being distributed according to their theoretical probability, one will have, with a probability exceeding $1 - \varepsilon/2$, intervals of total length exceeding $k$ for which $(\Delta X)^2 > 2c\,\Delta t$ ($c$ being arbitrarily large, $\varepsilon$ arbitrarily small, and $k$ determined as a function of $c$). Keeping these intervals, and subdividing the others, one will again find a fraction exceeding $k$ of the length of each of them for which, for the new intervals obtained, $(\Delta X)^2 > 2c\,\Delta t$, and this excepting cases of total probability less than $\varepsilon/4$. Take then for $p$ an integer such that $(1-k)^p < 1/2$. After $p$ analogous operations, except in cases of probability less than

$$\frac{\varepsilon}{2} + \frac{\varepsilon}{4} + \cdots + \frac{\varepsilon}{2^p} < \varepsilon,$$

one will have obtained a division of the interval $(0,1)$ into partial intervals for which more than half of the total length consists of partial intervals such that $(\Delta X)^2 > 2c\,\Delta t$; hence $\sum(\Delta X)^2 > c$, which proves theorem 9.

The result thus obtained for the random function $X(t)$ is not, as in the case of theorem 8, applicable to all models of linear Brownian motion. One can define such models for which one always has $(\Delta X)^2 \le c\,\Delta t$, hence $\overline{B}_n \le c$, $c$ being a sufficiently large constant.




Let us now take for $n$ a slowly increasing function $n(h)$ of an integer $h$ (for example the integer part of $\log\log\log h$), and suppose that to each value of $h$ we make correspond $n - 1$ points of division chosen at random, from which there results a value of $B_{n(h)} = B'_h$. The large number of experiments thus made for one and the same value of $n$ will lead to finding, for $B_n$, numbers filling the interval $(\underline{B}_n, \overline{B}_n)$, and, if $n(h)$ grows slowly enough, the sequence of the $B'_h$ will almost surely have as limit values all the numbers of the interval $(0, \beta)$, $\beta$ being the limit of $\overline{B}_n$.

According to this remark, even if it is possible to generalize the result obtained concerning the almost sure convergence of $B_n$ to $B$, one cannot apply it without any restriction on the choice of the modes of division of the interval $(0,1)$ successively considered. We do not know, in particular, whether it would be sufficient that the number of points of division be constantly increasing for the convergence of $B_n$ to $B$ to be almost sure.


6°. The existence of functions that are models of linear Brownian motion was not evident a priori. From the idealist point of view, it results from theorem 7. But this theorem gives us no means of naming such a function; this is what we shall now do.

For this we shall take our inspiration from what chance does; we shall seek to imitate it. M. Borel has shown that one cannot, in a general way, imitate chance; if one imitates certain characteristics of a sequence of numbers chosen at random, one necessarily omits others. But if one fixes attention on certain well determined conditions (here those that enter into the proof of theorem 7), one can, from this special point of view, imitate chance.

We shall first take, as a model of a sequence of numbers $x_1, x_2, \dots, x_n, \dots$ chosen at random between zero and one, the sequence of the fractional parts of the numbers $n/\log n$. It has the property that, provided $(n' - n)/\log n$ increases indefinitely with $n$, the $x_\nu$ with indices between $n$ and $n'$ are distributed uniformly between zero and one, the frequency of those lying between zero and any number $x$ between zero and one tending necessarily to $x$. One imitates chance as regards the uniformity of its distribution; but one does not imitate it in its caprices; a sequence of numbers actually chosen at random would not be made up of partial sequences of numbers increasing regularly from 0 to 1.
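The uniformity claimed for the fractional parts of $n/\log n$ can be checked empirically on a block of indices. This is only a numerical sketch; the block $10^5 \le n < 2\cdot 10^5$ and the helper names `fractional_parts` and `frequency_below` are choices made for illustration.

```python
import math

def fractional_parts(n_start, n_end):
    """Fractional parts of n / log n for n_start <= n < n_end (requires n_start >= 2)."""
    return [math.modf(n / math.log(n))[0] for n in range(n_start, n_end)]

def frequency_below(values, x):
    """Empirical frequency of the values lying in [0, x)."""
    return sum(1 for v in values if v < x) / len(values)

block = fractional_parts(10 ** 5, 2 * 10 ** 5)
for x in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(x, round(frequency_below(block, x), 4))

# Within the block, consecutive terms increase by about 1 / log n until they wrap around,
# which is the regular growth from 0 to 1 that a truly random sequence would not show.
print([round(v, 3) for v in block[:15]])
```

The frequencies should be close to $x$, while the last line shows the regular runs mentioned in the text.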

The uniformity of the distribution would be still better realized if one took $n\theta$ (with $\theta$ irrational) in place of $n/\log n$ (it would suffice that $n' - n$ increase indefinitely to obtain a uniform distribution of the terms with indices between $n$ and $n'$). But there would be between $x_n$ and $x_{2n}$ (which would be the fractional part of $2x_n$) a correlation that we avoid by taking for $x_n$ the fractional part of $n/\log n$; there is then no correlation between $x_n$ and $x_{2n}$, nor, more generally, between $x_n$ and $x_{n'}$ for $n' = 2^p n$.

Let us then set $x_n = F(\xi_n)$, $F(\xi)$ denoting the distribution function of the Gauss law; the sequence of the $\xi_n$ will be a model of a sequence of Gaussian variables chosen at random. Now, according to § 1, the successive determination of the numbers

$$X(1),\ X(1/2),\ X(1/4),\ X(3/4),\ X(1/8),\ \dots,$$

which leads to the determination of $X(t)$, depends on Gaussian variables chosen successively. It suffices to choose the sequence of the $\xi_n$ just defined to obtain a model of linear Brownian motion; we shall content ourselves with indicating this result without proof.
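The recipe just described can be written out as a short sketch. Several points are assumptions made for illustration rather than details given in the text: the index starts at $n = 2$ (so that $\log n > 0$), the interpolation rule is the usual midpoint rule in which the midpoint of an interval of length $h$ is the mean of the end values plus $\tfrac12\sqrt{h}\,\xi$, and the names `pseudo_gaussians` and `midpoint_model` are hypothetical. The code does not verify that the result is a model; the text itself states this without proof.

```python
import math
from statistics import NormalDist

def pseudo_gaussians(count):
    """xi_n = F^{-1}(x_n), where x_n is the fractional part of n / log n (n = 2, 3, ...)."""
    inverse_cdf = NormalDist().inv_cdf
    out = []
    n = 2
    while len(out) < count:
        x = math.modf(n / math.log(n))[0]
        if 0.0 < x < 1.0:  # inv_cdf is undefined at the endpoints
            out.append(inverse_cdf(x))
        n += 1
    return out

def midpoint_model(levels):
    """Values X(k / 2**levels), k = 0..2**levels, by successive midpoint interpolation."""
    xis = iter(pseudo_gaussians(2 ** levels))
    values = [0.0, next(xis)]  # X(0) = 0 and X(1) = xi_1
    for level in range(levels):
        h = 1.0 / 2 ** level  # current spacing between known points
        refined = [values[0]]
        for a, b in zip(values, values[1:]):
            refined.append(0.5 * (a + b) + 0.5 * math.sqrt(h) * next(xis))
            refined.append(b)
        values = refined
    return values

values = midpoint_model(10)
print(len(values), sum((b - a) ** 2 for a, b in zip(values, values[1:])))
```

The order in which the $\xi_n$ are consumed is $X(1)$, $X(1/2)$, $X(1/4)$, $X(3/4)$, and so on, matching the enumeration in the text.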


7°. The extension of the preceding results to plane Brownian motion is immediate. We shall here denote by $\Delta l$ the length of the chord $A(t)A(t+\Delta t)$, and make correspond to each polygonal line of $n$ sides inscribed in the arc $A(0)A(1)$ the sum

$$(34)\qquad B_n = \sum(\Delta l)^2 = \sum[(\Delta X)^2 + (\Delta Y)^2].$$


The results obtained for linear Brownian motion applying separately to $X(t)$ and $Y(t)$, one sees that here $B_n$ tends to 2 [or to $2t$ if one considers the arc $A(0)A(t)$] under the conditions which, in the case of linear motion, ensure convergence to 1 (or to $t$). The limit obtained may still be called the measure of the (plane) Brownian oscillation.

One may also propose to define models of plane Brownian motion. One must first suppose that, for each of the sums $\sum(\Delta X)^2$ and $\sum(\Delta Y)^2$ relative to each interval $(0,t)$, there is almost sure convergence to $t$. But this condition is not restrictive enough; it could be realized by taking $Y(t) = X(t)$. The motion would be rectilinear, and would give a very poor idea of plane Brownian motion. It is then natural to add a condition of independence of $X(t)$ and $Y(t)$; this will be that $\sum\Delta X\,\Delta Y$ tend to zero, under the conditions indicated for the convergence of $\sum(\Delta X)^2$ to unity [or to $t$, in the case of the arc $A(0)A(t)$]. This condition, which amounts to requiring that the measure of the linear Brownian agitation be always well defined and have the same value in projection on any line of the plane, is evidently realized, with probability one, in the stochastic scheme of plane Brownian motion.
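The two conditions above (sum of squared chord lengths near 2, cross sum $\sum\Delta X\,\Delta Y$ near 0) can be observed on a single random division. The sketch below samples the increments over one division directly, using independent Gaussian increments of variance $\Delta t$ for each coordinate; it illustrates the limits for one division, not a sequence of refinements of one path, and the name `plane_partition_sums` is hypothetical.

```python
import math
import random

def plane_partition_sums(division_points, rng):
    """Return (sum of (dX^2 + dY^2), sum of dX*dY) over the cells of one division of (0, 1)."""
    ts = sorted({0.0, 1.0, *division_points})
    sum_sq = 0.0
    cross = 0.0
    for a, b in zip(ts, ts[1:]):
        sd = math.sqrt(b - a)  # each coordinate increment is Gaussian with variance b - a
        dx = rng.gauss(0.0, sd)
        dy = rng.gauss(0.0, sd)
        sum_sq += dx * dx + dy * dy
        cross += dx * dy
    return sum_sq, cross

rng = random.Random(2)
for n in (100, 1000, 10000):
    points = [rng.random() for _ in range(n)]
    print(n, plane_partition_sums(points, rng))
```

Taking `dy = dx` instead of an independent draw would leave both $\sum(\Delta X)^2$ and $\sum(\Delta Y)^2$ near 1 but make the cross sum tend to 1, which is the rectilinear case excluded in the text.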

The extension of theorem 9 to plane Brownian motion is evident. The same is not true of theorem 8. The sum (34) is indeed very small only if the two terms $\sum(\Delta X)^2$ and $\sum(\Delta Y)^2$ are both very small. Now, if each of them can be made very small, it is not evident that they can be made very small simultaneously. We shall do no more than mention this question here.


5. The area bounded by the curve C. 1°. Studying now the plane motion, we shall denote by $S(t)$ the area between the arc $A(0)A(t)$ and its chord, the sign conventions being those made in representing an area by a line integral taken around its contour; we shall write $S$ in place of $S(1)$.

The integral representing the area will not be an integral in the sense of Riemann, but a stochastic integral, analogous in certain respects to the integral $B$ of § 4. It can be defined, not as the sure limit of a sum, but as a limit in probability, or in quadratic mean, or again as an almost sure limit.

As in the study of $B_n$, we shall begin with a simple case, considering the polygonal lines $L'_n$ inscribed in the arc $A(0)A(1)$, each having for vertices the points whose parameter values are multiples of $2^{-n}$. Let $S'_n$ denote the area between $L'_n$ and $L'_{n+1}$. It is the sum of $2^n$ triangles, whose areas are random variables independent of one another; we shall specify later the law on which they depend; it suffices to observe here that they have expected value zero, that of their squares being $1/2^{2n+3}$. One deduces

$$E\{S'_n\} = 0, \qquad E\{S_n'^2\} = 1/2^{n+3}.$$
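The second moment of one triangle can be checked by simulation. The sketch assumes the triangle has vertices at parameter values $k/2^n$, $(2k+1)/2^{n+1}$, $(k+1)/2^n$, so that its two half-chords are independent plane Gaussian vectors with variance $h = 2^{-(n+1)}$ per coordinate, and its signed area is half their cross product. The function name `triangle_area_second_moment` is hypothetical.

```python
import math
import random

def triangle_area_second_moment(n, samples, rng):
    """Monte Carlo estimate of E{area^2} for one triangle between L'_n and L'_{n+1}."""
    h = 2.0 ** -(n + 1)  # parameter length of each half-chord
    sd = math.sqrt(h)
    total = 0.0
    for _ in range(samples):
        dx1, dy1, dx2, dy2 = (rng.gauss(0.0, sd) for _ in range(4))
        area = 0.5 * (dx1 * dy2 - dx2 * dy1)  # signed area from the cross product
        total += area * area
    return total / samples

n = 3
estimate = triangle_area_second_moment(n, 100000, random.Random(4))
print(estimate, 1.0 / 2 ** (2 * n + 3))
```

Exactly, $E\{\text{area}^2\} = \tfrac14 \cdot 2h^2 = h^2/2 = 1/2^{2n+3}$, and summing over the $2^n$ independent triangles gives $1/2^{n+3}$.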


The series $\sum S'_n$, which represents $S$, is therefore convergent in quadratic mean. Although its terms are not independent, it is easy to establish its almost sure convergence. One can, for example, use Chebyshev's inequality, which gives

$$\Pr\{|S'_n| > 1/2^{n/4}\} < 1/2^{n/2}.$$

This probability being the term of a convergent series, there almost surely exists a number $N$ such that, for $n > N$,

$$|S'_n| \le 1/2^{n/4},$$

which establishes the result announced.


2°. Let us now consider, as in 2° of § 4, a sequence of numbers $t_1, t_2, \dots, t_n, \dots$ lying between zero and one and forming a set everywhere dense in this interval. We shall denote by $L_n$ the broken line going from $A(0)$ to $A(1)$ and having as intermediate vertices the points $A_1, A_2, \dots, A_n$ [writing $A_h$ in place of $A(t_h)$], arranged in the order of increasing $t$. We shall denote by $S_n$ the area between $L_n$ and the chord $A(0)A(1)$, and by $T_n$ the difference $S_n - S_{n-1}$, which is the area of a triangle having $A_n$ for vertex and for base a side $A_n'A_n''$ of $L_{n-1}$. We shall denote the length of this side by $l_n$, by $t_n'$ and $t_n''$ the parameter values of $A_n'$ and $A_n''$, and set

$$\tau_n = t_n'' - t_n', \qquad \tau_n' = t_n - t_n', \qquad \tau_n'' = t_n'' - t_n.$$


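If $E_{n-1}$ denotes an expected value calculated knowing $A_1, A_2, \dots, A_{n-1}$, we have

$$(35)\qquad E_{n-1}\{T_n\} = 0, \qquad E_{n-1}\{T_n^2\} = \frac{\tau_n'\tau_n''}{4\tau_n}\, l_n^2, \qquad E\{T_n^2\} = \frac{\tau_n'\tau_n''}{2} = \frac{1}{4}(b_{n-1} - b_n),$$

$b_n$ having the same meaning as in § 4, so that formula (28) is still applicable. According to the first of these formulas, one can apply the inequality of A. Kolmogoroff to the sum $\sum T_\nu$. This gives

$$\Pr\Big\{\operatorname*{Max}_{0 < h \le p} |S_{n+h} - S_n| \ge \frac{c}{2}\sqrt{b_n - b_{n+p}}\Big\} \le \frac{1}{c^2},$$

and, letting $p$ increase indefinitely,

$$(36)\qquad \Pr\Big\{\operatorname*{Max}_{\nu > n} |S_\nu - S_n| \ge \varepsilon'\Big\} \le \varepsilon \qquad \Big(\varepsilon' = \frac{c}{2}\sqrt{b_n},\ \ \varepsilon = \frac{1}{c^2}\Big).$$

Since $b_n$ tends to zero as $n$ tends to infinity, $\varepsilon$ and $\varepsilon'$ can be made arbitrarily small simultaneously. The almost sure convergence of the sequence of the $S_n$ follows.¹¹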
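The relation between the variance of $T_n$ and the numbers $b_n$ rests on a small identity that is easy to verify exactly. The sketch below assumes that $b_n$ denotes the sum of the squares of the lengths of the cells cut out of $(0,1)$ by $t_1, \dots, t_n$ (its definition in § 4 is not reproduced here); under that assumption inserting $t_n$ into a cell of length $\tau_n = \tau_n' + \tau_n''$ lowers the sum by $2\tau_n'\tau_n''$, which is four times $E\{T_n^2\} = \tau_n'\tau_n''/2$. The name `b_values` is hypothetical.

```python
from fractions import Fraction

def b_values(points):
    """Return ([b_0, b_1, ...], checks): b_n is the sum of squared cell lengths after
    inserting the first n points; checks[n-1] tests b_{n-1} - b_n == 2 * tau' * tau''."""
    cuts = [Fraction(0), Fraction(1)]
    b_list = [Fraction(1)]  # b_0: the single cell (0, 1)
    checks = []
    for t in points:
        t = Fraction(t)
        cuts.append(t)
        cuts.sort()
        i = cuts.index(t)
        tau1 = t - cuts[i - 1]   # tau'  = t_n - t_n'
        tau2 = cuts[i + 1] - t   # tau'' = t_n'' - t_n
        b_new = sum((hi - lo) ** 2 for lo, hi in zip(cuts, cuts[1:]))
        checks.append(b_list[-1] - b_new == 2 * tau1 * tau2)
        b_list.append(b_new)
    return b_list, checks

b_list, checks = b_values([Fraction(1, 2), Fraction(1, 4), Fraction(3, 4), Fraction(1, 3)])
print(b_list, checks)
```

Summing the identity from $n+1$ to $n+p$ gives the variance $\tfrac14(b_n - b_{n+p})$ of $S_{n+p} - S_n$ that enters Kolmogoroff's inequality.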


3°. Let us now show that, if the sequence of the $t_n$ is replaced by another analogous sequence $\bar t_1, \bar t_2, \dots, \bar t_n, \dots$, the two expressions $S$ and $\bar S$ successively obtained for the area studied are almost surely the same. It evidently suffices to show that the polygonal areas $S_n$ and $\bar S_n$ of which they are the limits differ infinitely little in probability, and for this to show that each of them differs infinitely little in probability from the area $S_n''$ bounded by the polygonal line inscribed in $C$ having for vertices all the points with parameter values $t_1, t_2, \dots, t_n, \bar t_1, \bar t_2, \dots, \bar t_n$. For $S_n$, for example, this results from formula (36), which evidently applies to any mode of subdivision of the arc $A(0)A(1)$ beginning with the points with parameter values $t_1, t_2, \dots, t_n$.

One can also, using formulas (35), show that the quadratic means of $S_n - S_n''$ and $\bar S_n - S_n''$ are infinitely small.
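The agreement of the areas obtained from two different sequences of division points can be observed on a simulated path. The sketch below makes several assumptions for illustration: the curve is a plane random walk on a fine grid, the division points are grid indices drawn at random, the area is computed with the shoelace formula for the polygon closed by the chord, and the names `plane_path` and `polygon_area` are hypothetical.

```python
import math
import random

def plane_path(num_steps, rng):
    """Plane random-walk approximation of A(t) = (X(t), Y(t)) on [0, 1]."""
    sd = math.sqrt(1.0 / num_steps)
    xs, ys = [0.0], [0.0]
    for _ in range(num_steps):
        xs.append(xs[-1] + rng.gauss(0.0, sd))
        ys.append(ys[-1] + rng.gauss(0.0, sd))
    return xs, ys

def polygon_area(xs, ys, indices):
    """Signed area between the inscribed polygon through the given sample indices
    and the chord joining the last point back to the first (shoelace formula)."""
    idx = sorted(set(indices) | {0, len(xs) - 1})
    twice_area = 0.0
    for a, b in zip(idx, idx[1:] + idx[:1]):  # the final pair closes along the chord
        twice_area += xs[a] * ys[b] - xs[b] * ys[a]
    return 0.5 * twice_area

rng = random.Random(3)
xs, ys = plane_path(20000, rng)
full = polygon_area(xs, ys, range(len(xs)))
first = polygon_area(xs, ys, [rng.randrange(1, 20000) for _ in range(2000)])
second = polygon_area(xs, ys, [rng.randrange(1, 20000) for _ in range(2000)])
print(full, first, second)
```

The three printed values should be close to one another; their differences have a standard deviation of the order of half the square root of the sum of the squared cell lengths.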


11 From the point of view of the uniformity of the convergence with respect to the choice of the sequence of the $t_n$, it should be noted that $\varepsilon$ and $\varepsilon'$ depend only on $b_n$, itself bounded above by $\varepsilon_n$ [formula (27)].




4°. Let us now introduce chance into the choice of the $t_n$. The reasoning being identical to that made for the $B_n$, we shall only recall its main lines. It was in order not to have to repeat it that we introduced, in the study of $B_n$, the general expression (33), of which $B_n$ was only a particular form; one need only note that $S_n$ depends on the two random functions $X(t)$ and $Y(t)$; but this changes nothing in the reasoning made for expression (33), and the result obtained there applies to $S_n$. We therefore need not fear that our reasoning introduces non-measurable sets or undetermined probabilities. The probability of the convergence of $S_n$ to $S$ is well determined, and it matters little, in calculating it, whether one first makes the experiments that determine the $t_n$ and then those that determine $C$, or the reverse. Now we know that, provided it is almost sure that the sequence of the $t_n$ is everywhere dense, it is almost sure that $S_n$ tends to $S$. Saying that, for a given curve $C$, the area $S$ is stochastically defined if, the points $t_n$ being chosen at random, $S_n$ tends almost surely to a non-random limit $S$, we see that:


THEOREM 10. The random scheme of plane Brownian motion leads, with probability one, to a curve $C$ for which the area $S$ is stochastically well defined.¹²


5°. The question naturally arises of determining the law on which $S$ depends. We shall first treat a more elementary problem: to determine the law on which the area of a triangle inscribed in $C$ depends, its vertices having given parameter values.

We shall denote these parameter values by $t - \tau'$, $t$, $t + \tau''$. The area is evidently of the form $\tfrac{1}{2}\sqrt{\tau'\tau''}\,T$, the nature of the random variable $T$ being independent of $t$, $\tau'$ and $\tau''$. If $\lambda$ denotes the length of a reduced Gaussian vector, and if $\eta$ is a reduced Gaussian variable, the length of the side $A(t-\tau')A(t)$ of the triangle studied is of the form $\lambda\sqrt{\tau'}$, and that of the height perpendicular to this side is $\eta\sqrt{\tau''}$; $\lambda$ and $\eta$ are independent, and $T = \lambda\eta$.

Let us calculate the moments of the random variable $T$. We evidently have

$$\mu_{2p} = E\{\lambda^{2p}\}\,E\{\eta^{2p}\} = 2^p\,\Gamma(p+1)\cdot 1\cdot 3\cdot 5\cdots(2p-1) = (2p)!,$$


12 Of course, the area $S$ is defined only stochastically. One can develop on this subject remarks analogous to those that led us to theorem 9. The result is that $C$ almost surely has the following property: one can define the sequence of the $t_n$ in such a way that $S_n$ does not tend to $S$, and even in such a way that $S_n$ has any given limit, finite or infinite.




and consequently

$$(38)\qquad \varphi(z) = E\{e^{izT}\} = \sum_{p \ge 0} (-1)^p z^{2p} = \frac{1}{1+z^2} \qquad (|z| < 1).$$

The fact that one finds for $\varphi(z)$ a power series with positive radius of convergence suffices, as is known, to ensure that the characteristic function of the law studied is indeed the one defined on the whole real axis by the analytic continuation of this function. The law is therefore the first law of Laplace, that is, the symmetric law defined by

$$(39)\qquad \Pr\{T > x\} = \tfrac{1}{2}\,e^{-x} \qquad (x > 0).$$
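Both steps of this computation can be checked directly. The sketch below verifies exactly that $2^p\,p!\cdot 1\cdot 3\cdots(2p-1) = (2p)!$, and then compares by simulation the tail of $T = \lambda\eta$ with that of the symmetric Laplace law of density $\tfrac12 e^{-|x|}$, for which $\Pr\{T > x\} = \tfrac12 e^{-x}$ when $x > 0$. The names `moment_T`, `sample_T` and `tail_frequency` are hypothetical.

```python
import math
import random

def moment_T(p):
    """E{T^(2p)} = E{lambda^(2p)} * E{eta^(2p)} for T = lambda * eta."""
    even_lambda = 2 ** p * math.factorial(p)      # lambda^2 is twice a unit exponential
    even_eta = math.prod(range(1, 2 * p, 2))      # 1 * 3 * ... * (2p - 1)
    return even_lambda * even_eta

def sample_T(rng):
    """One draw of T: length of a reduced plane Gaussian vector times a reduced Gaussian."""
    lam = math.hypot(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
    return lam * rng.gauss(0.0, 1.0)

def tail_frequency(x, samples, rng):
    """Empirical frequency of the event T > x."""
    return sum(1 for _ in range(samples) if sample_T(rng) > x) / samples

print([moment_T(p) == math.factorial(2 * p) for p in range(6)])
rng = random.Random(5)
for x in (0.5, 1.0, 2.0):
    print(x, tail_frequency(x, 200000, rng), 0.5 * math.exp(-x))
```

Since $\mu_{2p}/(2p)! = 1$, the series for the characteristic function is geometric, which is where the sum $1/(1+z^2)$ comes from.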


6°. Let us now study the law on which $S(1)$ depends; because of the stochastic similarity between the curve $C$ and its parts, $S(t)/t$ depends on the same law. This random variable being correlated with the length $L(t) = \lambda\sqrt{t}$ of the chord $A(0)A(t)$, we shall study the distribution function in two variables, evidently independent of $t$,

$$F(\alpha, \rho) = \Pr\{S(t) < \alpha t,\ L(t) < \rho\sqrt{t}\}.$$


We shall begin by admitting that the first three derivatives of this function are defined and continuous, except perhaps for $\alpha = 0$; this hypothesis will be justified later. It results on the other hand from the way in which $S$ was defined as the sum of a series converging in quadratic mean (§ 5, 1°) that its first two moments are finite.

We shall denote by $\xi\sqrt{dt}$ and $\eta\sqrt{dt}$ the two components of $A(t)A(t+dt)$ along the direction $A(0)A(t)$ and along the perpendicular direction; $\xi$ and $\eta$ are two reduced Gaussian variables, independent of each other and independent of the arc $A(0)A(t)$, hence of $S(t)$ and $L(t)$. From

$$(L + \delta L)^2 = (L + \xi\sqrt{dt})^2 + \eta^2\,dt$$

[writing $L$ in place of $L(t)$], one deduces

$$(40)\qquad \delta L = \xi\sqrt{dt} + \frac{\eta^2}{2L}\,dt + o(dt),$$

while the variation of $S = S(t)$ is evidently

$$(41)\qquad \delta S = \frac{L\eta}{2}\sqrt{dt} + S_1(dt),$$

$S_1(dt)$ depending on the same law as $S(dt)$, hence as $dt\,S(1)$; this area has expected value zero, and is stochastically independent of the arc $A(0)A(t)$, hence of $S(t)$ and $L(t)$, but not of $\xi$ and $\eta$.
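The expansion of $\delta L$ can be checked numerically for fixed values of $L$, $\xi$ and $\eta$. The sketch below compares the exact increment $\sqrt{(L+\xi\sqrt{dt})^2 + \eta^2 dt} - L$ with the two-term approximation $\xi\sqrt{dt} + \eta^2 dt/(2L)$; the sample values and the name `delta_L_remainder` are chosen for illustration only.

```python
import math

def delta_L_remainder(L, xi, eta, dt):
    """Exact increment of the chord length minus its two-term expansion."""
    root = math.sqrt(dt)
    exact = math.hypot(L + xi * root, eta * root) - L
    approx = xi * root + eta ** 2 * dt / (2.0 * L)
    return exact - approx

for dt in (1e-2, 1e-4, 1e-6):
    r = delta_L_remainder(1.0, 0.7, -1.3, dt)
    print(dt, r, r / dt)
```

The ratio of the remainder to $dt$ decreases with $dt$ (the remainder is of order $dt^{3/2}$, with leading term $-\xi\eta^2 dt^{3/2}/(2L^2)$), which is what the term $o(dt)$ asserts for fixed $\xi$, $\eta$ and $L > 0$.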




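To form an equation satisfied by the function $F(\alpha, \rho)$, we shall calculate in two different ways the probability

$$(42)\qquad P = \Pr\{S + \delta S < \alpha t,\ L + \delta L < \rho\sqrt{t}\}.$$

A first evaluation of $P$ rests on the remark that, $F(\alpha, \rho)$ not depending on the time, we have

$$\Pr\{S + \delta S < \alpha_1(t + dt),\ L + \delta L < \rho_1\sqrt{t + dt}\} = F(\alpha_1, \rho_1).$$

Defining then $\alpha_1$ and $\rho_1$ by the formulas

$$\alpha_1(t + dt) = \alpha t, \qquad \rho_1\sqrt{t + dt} = \rho\sqrt{t},$$

from which one obtains

$$\alpha_1 - \alpha = -\alpha\,\frac{dt}{t} + o(dt), \qquad \rho_1 - \rho = -\rho\,\frac{dt}{2t} + o(dt),$$

it follows that

$$(43)\qquad P = F(\alpha_1, \rho_1) = F(\alpha, \rho) - \Big(\alpha F'_\alpha + \frac{\rho}{2}F'_\rho\Big)\frac{dt}{t} + o(dt).$$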


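The second way of evaluating $P$ rests on the expressions (40) and (41) for $\delta L$ and $\delta S$, which show that the expression

$$P_1 = \Pr\Big\{S + \frac{L\eta}{2}\sqrt{dt} + S_1(dt) < \alpha t,\ \ L + \xi\sqrt{dt} + \frac{\eta^2}{2\rho\sqrt{t}}\,dt < \rho\sqrt{t}\Big\}$$

is of the form $P + o(dt)$. To evaluate $P_1$, we set

$$\alpha' t = \alpha t - S_1(dt), \qquad \rho'\sqrt{t} = \rho\sqrt{t} - \xi\sqrt{dt} - \frac{\eta^2}{2\rho\sqrt{t}}\,dt,$$

and denote by $P_1'$, in place of $P_1$, a conditional probability evaluated on the assumption that $\xi$, $\eta$ and $S_1(dt)$ are known; $P_1$ is its expected value. We evidently have

$$P_1' = F(\alpha', \rho') - \frac{\eta}{2}\sqrt{\frac{dt}{t}}\int_0^{\rho'}\lambda F''_{\alpha\rho}(\alpha', \lambda)\,d\lambda + \frac{\eta^2}{8}\,\frac{dt}{t}\int_0^{\rho'}\lambda^2 F'''_{\alpha^2\rho}(\alpha', \lambda)\,d\lambda + K\eta^3\Big(\frac{dt}{t}\Big)^{3/2},$$

$|K|$ having an upper bound that is a function of $\alpha'$ and $\rho'$ only. Since

$$E\{\eta\} = 0, \qquad E\{\eta^2\} = 1, \qquad E\{|\eta^3|\} < \infty,$$

it follows that

$$P_1 = E\{P_1'\} = E\{F(\alpha', \rho')\} + \frac{dt}{8t}\int_0^{\rho}\lambda^2 F'''_{\alpha^2\rho}(\alpha, \lambda)\,d\lambda + o(dt).$$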




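Taking account, on the other hand, of

$$F(\alpha', \rho') = F(\alpha, \rho) - \frac{S_1(dt)}{t}F'_\alpha - \Big(\xi\sqrt{\frac{dt}{t}} + \frac{\eta^2}{2\rho}\,\frac{dt}{t}\Big)F'_\rho + \frac{\xi^2}{2}\,\frac{dt}{t}F''_{\rho^2} + K_1\,o(dt),$$

where $|K_1|$ is bounded by a polynomial in $|\xi|$, $|\eta|$ and $|S_1(dt)/t|$ whose expected value is finite, and where the expression denoted by $o(dt)$ is not random, it follows that

$$P_1 = F(\alpha, \rho) + \Big[\frac{1}{2}F''_{\rho^2} - \frac{1}{2\rho}F'_\rho + \frac{1}{8}\int_0^{\rho}\lambda^2 F'''_{\alpha^2\rho}(\alpha, \lambda)\,d\lambda\Big]\frac{dt}{t} + o(dt).$$

Since $P_1 = P + o(dt)$, the comparison of this formula with formula (43) gives

$$(44)\qquad 2\alpha F'_\alpha + \Big(\rho - \frac{1}{\rho}\Big)F'_\rho + F''_{\rho^2} + \frac{1}{4}\int_0^{\rho}\lambda^2 F'''_{\alpha^2\rho}(\alpha, \lambda)\,d\lambda = 0,$$

whence, differentiating with respect to $\rho$ and setting $F'_\rho = G$,

$$(45)\qquad 2\alpha G'_\alpha + \Big(\rho - \frac{1}{\rho}\Big)G'_\rho + G''_{\rho^2} + \Big(1 + \frac{1}{\rho^2}\Big)G + \frac{\rho^2}{4}G''_{\alpha^2} = 0.$$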


7°. THEOREM 11. $F(\alpha, \rho)$ is the only solution of equation (44) that is a distribution function.


Equation (44) indeed expresses that $S(t)/t$ and $L(t)/\sqrt{t}$ depend on a law in two variables that is independent of $t$. If, for $t = t_0 > 0$, one took an arbitrary initial law, the variation of $S$ and that of $L$ being thereafter defined by formulas (40) and (41), one would have, in place of $F(\alpha, \rho)$, a function $F(\alpha, \rho, t)$ satisfying an equation analogous to equation (44), but with a right-hand side $2tF'_t$. A solution of this equation being well determined by its initial value (for $t = t_0$), equation (44) does express the necessary and sufficient condition for the probability law considered not to vary with $t$.

Suppose then that there are two different solutions $F_1$ and $F_2$ of equation (44) that are distribution functions. One can make correspond to them two systems of random variables $S_1, L_1$ and $S_2, L_2$ which, for two curves $C_1$ and $C_2$ depending on suitably determined processes for $t$ varying from zero to one, represent the area between the arc $A(0)A(1)$ and its chord, and the length of this chord; the system $S_1, L_1$ will depend on the law defined by $F_1$; $S_2, L_2$ on that defined by $F_2$.




Let us now displace the figure on which one of these curves is traced, in such a way that, for the two curves considered, $A(1)$ has the same position and $A(1)A(0)$ the same orientation; the two origins $A_1$ and $A_2$ of the two curves will thus lie on one and the same half-line issuing from their common endpoint $A(1)$. Letting $t$ then vary from the value 1, one prolongs $C_1$ and $C_2$ by the same curve $C'$ depending on the stochastic scheme of Brownian motion. Then the system $S_1(t)/t$, $L_1(t)/\sqrt{t}$ will not cease to depend on the law defined by $F_1$; the system $S_2(t)/t$, $L_2(t)/\sqrt{t}$ will likewise depend on that defined by $F_2$.

Now $|L_1 - L_2|$ is bounded above by the distance $A_1A_2$, and $2(S_2 - S_1)/A_1A_2$ represents the distance from the point $A(t)$ to $A_1A_2$; it is a Gaussian variable of parameter $\sqrt{t-1}$. It follows that asymptotically, for $t$ infinite, the two systems of variables considered coincide; in precise terms, the differences

$$\frac{S_1 - S_2}{t}, \qquad \frac{L_1 - L_2}{\sqrt{t}}$$

tend in probability to zero. The two systems therefore depend, in the limit, on the same probability law; hence $F_1 = F_2$, q.e.d.


8°. Although equations (44) and (45) solve the problem posed in theory, we have not been able to obtain the explicit expression of $F(\alpha, \rho)$; perhaps no simple expression of this function exists. Under these conditions it may be useful to indicate other methods that allow one to study the random variable $S$ and its correlation with $L$. In what follows, $L$ and $S$ will no longer denote $L(t)$ and $S(t)$, but $L(1)$ and $S(1)$.

One can first calculate the moments of the law in two variables $L$ and $S$.¹³ The moment

$$\mu_{p,q} = E\{L^p S^q\}$$

being evidently zero if $q$ is odd, it suffices to calculate those whose second index is even. The calculation rests on formula (41), an exact formula in which one can replace $t$ and $dt$ by one. Denoting by $L$ and $L_1$ the lengths of the chords $A(0)A(1)$ and $A(1)A(2)$, and by $\varphi$ their angle, which is a variable chosen at random between $-\pi$ and $+\pi$, this formula takes the form

$$(46)\qquad S(2) = S + S_1 + \tfrac{1}{2}LL_1\sin\varphi.$$


13 All these moments are finite; this results evidently from […] and from the fact that, as we shall see later, the probability of large values of $|S|$ decreases like an exponential.


526 M. PAUL LEVY. 


Now $\varphi$ on the one hand, $L$ and $S$ on the other, and finally $L_1$ and $S_1$, constitute groups of variables that are independent of one another; the two-variable law of $L$ and $S$ is the same as that of $L_1$, $S_1$, and, by a change of unit, it determines the law of the system $S(2)$, $L(2)$. We thus have

$$E\{S^2(2)\} = E\{S^2 + S_1^2 + \tfrac14 L^2 L_1^2 \sin^2\varphi\}$$

and consequently

$$\mu_{0,2} = \frac{\mu_{2,0}^2}{16} = \frac14.$$


One next determines $\mu_{2,2}$ by computing $E\{L^2(2)S^2(2)\}$ with the help of formula (46) and of

(46')  $L^2(2) = L^2 + L_1^2 + 2LL_1\cos\varphi,$

then $\mu_{0,4}$, and so on. All the moments are thus obtained without difficulty, but successively, each computation depending on moments computed previously. It is therefore a recurrence method, and the general expression of these moments may be difficult to obtain.
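
The first step of this recurrence can be spelled out as follows. The sketch assumes, in addition to formula (46), the scaling property that $S(2)$ has the same law as $2S$, the value $\mu_{2,0} = E\{L^2\} = 2$ for the chord of a reduced plane Gaussian vector, and $E\{\sin^2\varphi\} = \tfrac12$ for an angle uniform on $(-\pi, \pi)$; these are assumptions of the sketch rather than statements quoted from the text.

```latex
\begin{align*}
E\{S^2(2)\} &= E\{(2S)^2\} = 4\mu_{0,2}
  && \text{(change of unit, assumed)}\\
E\{S^2(2)\} &= E\{S^2\} + E\{S_1^2\}
              + \tfrac14 E\{L^2\}\,E\{L_1^2\}\,E\{\sin^2\varphi\}
  && \text{(cross terms vanish by independence and symmetry)}\\
            &= 2\mu_{0,2} + \tfrac14 \cdot 2 \cdot 2 \cdot \tfrac12
             = 2\mu_{0,2} + \tfrac12,\\
\text{hence}\quad \mu_{0,2} &= \tfrac14 .
\end{align*}
```

The higher moments follow the same pattern, each equation involving only moments of lower order already computed.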


9°. Another method rests on the remarks about interpolation made above (§3, 1°). The arc $A(0)A(1)$ is determined by a sequence of independent Gaussian vectors, which define successively $A(1)$, then $A(\tfrac12)$, $A(\tfrac14)$, $A(\tfrac34)$, and so on. Let $C_0$, $X_0(t)$, $Y_0(t)$, $S_0$ denote what the curve $C$ and the quantities $X(t)$, $Y(t)$ and $S$ respectively become when the length of the first of these vectors is replaced by zero, the others being unchanged. If the $x$-axis has been oriented parallel to $A(0)A(1)$, we evidently have

$$Y(t) = Y_0(t), \qquad X(t) = X_0(t) + tL \qquad (0 \le t \le 1),$$

and consequently

(48)  $S = \int_0^1 Y(t)\,dX(t) = S_0 + L I_1 \qquad \left[I_1 = \int_0^1 Y(t)\,dt\right].$


Since $S_0$ and $I_1$ depend only on the orientation of the first of the successively chosen vectors and on the following vectors, the pair of variables $S_0$, $I_1$ is independent of $L$. Under these conditions formula (48) defines precisely the nature of the correlation between $L$ and $S$; it shows in particular that the conditional moment $E'\{S^p\}$, computed under the hypothesis $L = \lambda$, is a polynomial of degree $p$ in $\lambda$ (evidently zero if $p$ is odd, and even if $p$ is even).

To make this information more precise, one may seek to define the laws on which $I_1$ and $S_0$ depend and the correlation between these variables. Since $Y(t)$ is a sum of Gaussian terms of the form $\eta\sqrt{dt}$, $I_1$ is of the form


LE MOUVEMENT BROWNIEN PLAN. 527 


$$I_1 = \int_0^1 (1-t)\,\eta\sqrt{dt},$$

the $\eta$ being linked by the relation

$$Y(1) = \int_0^1 \eta\sqrt{dt} = 0.$$

Without this relation, the two-variable law of $Y_1$ and $I_1$ would be a Gaussian law, fully determined by its second-order moments

$$E\{Y_1^2\} = 1, \qquad E\{Y_1 I_1\} = \tfrac12, \qquad E\{I_1^2\} = \tfrac13.$$

Consequently, under the hypothesis $Y_1 = 0$, $I_1$ is of the form $\eta/2\sqrt3$, $\eta$ being a reduced Gaussian variable.¹⁴ The product $I_1 L$ is then of the form $T/2\sqrt3$, $T$ depending on the law defined by formula (39).

For the study of the two-variable law of $I_1$, $S_0$, one can form a partial differential equation analogous to equation (45) for the variables $L$ and $S$. One can also compute its moments. Let us observe only that, when $Y(t)$ and hence $I_1$ are known, $S_0$ depends on a symmetric law; we therefore have, if $p$ is odd,

(49)  $E\{S_0^p I_1^q\} = 0.$

From formula (48) one then deduces

$$E\{S^2\} = E\{S_0^2\} + E\{I_1^2\}\,E\{L^2\},$$

and consequently

$$E\{S_0^2\} = \tfrac14 - \tfrac16 = \tfrac1{12}.$$

(Note that $E\{I_1^2\}$, computed under the hypothesis $Y_1 = 0$, has the value 1/12, and not 1/3.)
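
The two values 1/3 and 1/12 can be checked directly from covariances. The sketch assumes that $Y(t)$ is a reduced linear Brownian motion with $Y(0) = 0$, so that $E\{Y(s)Y(t)\} = \min(s,t)$, and that conditioning on $Y(1) = 0$ replaces this covariance by $\min(s,t) - st$ (the usual Brownian-bridge covariance, a standard fact not stated in the text).

```latex
\begin{align*}
E\{I_1^2\} &= \int_0^1\!\!\int_0^1 \min(s,t)\,ds\,dt = \tfrac13
  && (Y_1 \text{ not known}),\\
E\{I_1^2 \mid Y_1 = 0\} &= \int_0^1\!\!\int_0^1 \bigl[\min(s,t) - st\bigr]\,ds\,dt
  = \tfrac13 - \tfrac14 = \tfrac1{12}
  && (Y_1 = 0).
\end{align*}
```

The second value corresponds to a standard deviation $1/2\sqrt3$.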


10°. The last of the methods we wish to indicate rests on the remark that the law under study can be defined as the limit of the law of the area $S_n$ lying between a polygonal line $L_n$ inscribed in the arc $A(0)A(1)$ and the chord of this arc. This is quite evident, since $S_n$ tends in probability to $S(1)$. Now $S_n$ is a sum of $n$ triangles $A(0)A(t)A(t+dt)$ ($t$ and $t+dt$ denoting the parameter values of two consecutive vertices of $L_n$); it therefore suffices to study the law of this sum.


¹⁴ The simplest way to compute the numerical coefficient $\frac{1}{2\sqrt3}$ is doubtless to note that $E\{[\int_0^1 [Y_0(t) + tY_1]\,dt]^2\}$ has the value $\tfrac13$ obtained for $E\{I_1^2\}$ when $Y_1 = Y(1)$ is not assumed known.




Let $\xi\sqrt{dt}$ and $\eta\sqrt{dt}$ again denote the components of $A(t)A(t+dt)$ along the direction $A(0)A(t)$ and along the perpendicular direction, and consider the conditional law of $S_n$, and in the limit of $S$, when the $\xi$ and the $|\eta|$ are known, and hence all the values successively taken by $L(t)$. The area $S_n$ appears as a sum $\sum \pm \tfrac12 L(t)\,|\eta|\sqrt{dt}$, only the signs being undetermined, and independent of one another. Neglecting very improbable cases, the largest of these terms is very small compared with the sum (the verification presents no difficulty). It follows that the limiting conditional law obtained for $S$ is the Gaussian law, that is, $S$ is of the form $\sigma\zeta$, $\sigma$ denoting the root-mean-square value of $S$ for the conditional law; $\zeta$ is a reduced Gaussian variable; it is therefore independent of $\sigma$, and we are led to study the law of the expression

$$\sigma^2 = \lim \tfrac14 \sum L^2(t)\,\eta^2\,dt = \tfrac14 \int_0^1 L^2(t)\,dt,$$

the convergence considered being a convergence in probability.
Now $4\sigma^2$ is the sum of the two integrals

$$J = \int_0^1 X^2(t)\,dt, \qquad J_1 = \int_0^1 Y^2(t)\,dt,$$

which are independent of each other and depend on the same law; we are thus led to study this law.
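
The decomposition of $S_n$ into triangles with common vertex $A(0)$ can be illustrated numerically. The following is a minimal sketch, not the author's procedure: the function names `triangle_chain_area` and `conditional_sigma_squared` are hypothetical, and the second function uses a left-endpoint Riemann sum for $\tfrac14\int_0^1 L^2(t)\,dt$ with $L(t) = |A(0)A(t)|$, which is an assumption about the discretization.

```python
def triangle_chain_area(points):
    """Signed area between an open polygonal line and its chord,
    computed as the sum of the signed triangles A(0) A(t_k) A(t_{k+1})."""
    x0, y0 = points[0]
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points[1:], points[2:]):
        # Half the cross product of A(0)A(t_k) and A(0)A(t_{k+1}).
        total += 0.5 * ((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))
    return total


def conditional_sigma_squared(points, dt):
    """Left-endpoint Riemann sum for (1/4) * integral of L(t)^2 dt,
    where L(t) is the chord length |A(0)A(t)| (discretization assumed)."""
    x0, y0 = points[0]
    return 0.25 * sum(((x - x0) ** 2 + (y - y0) ** 2) * dt
                      for x, y in points[:-1])


# Three sides of the unit square, traversed counterclockwise from A(0) = (0, 0):
# the region between this polygonal line and its chord is the whole square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```

Reversing the order of the vertices changes the sign of the area, which is the sign indeterminacy mentioned in the text; the conditional variance depends only on the chord lengths.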

One can apply to the study of the two-variable law of $J$ and $X = X(1)$ methods analogous to those applied to the study of $S$ and $L$. Let us only indicate without proof that, setting


Pr{X <aJ =H(a,8), 


one obtains the partial differential equation

(50)  $H + \alpha H'_\alpha + (4\beta - 2\alpha^2)\,H'_\beta + H''_{\alpha^2} = 0,$

which plays a role analogous to that of equation (45), but which is of parabolic type. One can show, here again, that it determines the law under study completely.

One can also compute successively the various moments of the two-variable law of $J$ and $X$, and also show that the correlation between $J$ and $X$ is of the second degree, that is,

$$J = J_0 + 2I_1X + \tfrac13 X^2,$$

$I_1$ and $J_0$ being independent of $X$.




The study of $J$ is thus analogous to that of $S$, but from certain points of view simpler; note that $J$ is an integral of classical type, even though it depends on a random function; moreover it depends only on the single function $X(t)$. In spite of this, the function $H$ has not seemed to us, any more than $G$, likely to have a simple expression.

The formula obtained,

(51)  $S = \sigma\zeta \qquad (4\sigma^2 = J + J_1,\ \sigma > 0),$

is nonetheless capable of giving useful information on the nature of the random variable $S$. It shows in particular that the distribution function of $S$ is continuous and indefinitely differentiable, except perhaps at the value zero of the variable; the law it defines is symmetric. These are indeed properties that are necessarily satisfied by a product of two independent factors if they are satisfied by one of the factors (here $\zeta$).


11°. We shall now indicate another consequence of formula (51): from the point of view of the order of magnitude of the probability of large values of $|S|$, the law of $S$ is comparable to the first law of Laplace.

Let us first bound from above the probability of large values of $\sigma$. We have

$$\Pr\{\sigma > s\} \le \Pr\{\operatorname{Max}(J, J_1) > 2s^2\} \le 2\Pr\{J > 2s^2\} \le 2\Pr\{\operatorname{Max}_{t \le 1} |X(t)| > s\sqrt2\} \le 4\Pr\{\operatorname{Max}_{t \le 1} X(t) > s\sqrt2\},$$

that is,

$$\Pr\{\sigma > s\} \le 8\Pr\{X > s\sqrt2\}.$$

Consequently, if $c\sqrt2 > 1$, we have, for $s$ large enough,

(52)  $\Pr\{\sigma > s\} \le \Pr\{c\lambda > s\},$

$\lambda$ still denoting the length of a reduced Gaussian vector.
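
The passage from the bound $8\Pr\{X > s\sqrt2\}$ to (52) can be made explicit. The sketch below assumes the standard Gaussian tail bound $\Pr\{X > a\} \le \tfrac12 e^{-a^2/2}$ for $a \ge 0$ and the law $\Pr\{\lambda > r\} = e^{-r^2/2}$ of the length of a reduced plane Gaussian vector; neither is written out in the text.

```latex
\begin{align*}
\Pr\{\sigma > s\} &\le 8\Pr\{X > s\sqrt2\} \le 4e^{-s^2},\\
\Pr\{c\lambda > s\} &= e^{-s^2/(2c^2)},\\
4e^{-s^2} \le e^{-s^2/(2c^2)}
  &\iff s^2\Bigl(1 - \frac{1}{2c^2}\Bigr) \ge \log 4 .
\end{align*}
```

The last inequality holds for all sufficiently large $s$ as soon as $2c^2 > 1$, that is, $c\sqrt2 > 1$.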

Now an inequality of this form persists if one multiplies both $\sigma$ and $\lambda$ by the variable $|\zeta|$, non-negative and independent of them. It expresses in fact that one can establish between $\sigma$ and $\lambda$ a correlation such that one always has $\sigma \le c\lambda$. One deduces

(53)  $\Pr\{|S| > s\} \le \Pr\{c\,|T| > s\} \qquad (c\sqrt2 > 1),$

$T = \lambda X$ depending on the first law of Laplace, as we saw above.

To obtain, on the contrary, a lower bound for the left-hand side, let us note that in formula (48), when $L$ and $Y(t)$ (hence $I_1$) are known, $X_0(t)$, and consequently $S_0$, depend on symmetric laws. There is therefore one chance in two that $S_0$ has the sign of $I_1$, and that consequently $|S| \ge L\,|I_1|$. One deduces

$$\Pr\{|S| > s\} \ge \tfrac12 \Pr\{L\,|I_1| > s\} = \tfrac12 \Pr\{|T| > 2\sqrt3\,s\},$$

and consequently, for $c$ small enough ($2c\sqrt3 < 1$) and $s$ large enough,

(54)  $\Pr\{|S| > s\} \ge \Pr\{c\,|T| > s\} \qquad (2c\sqrt3 < 1).$

It is then probable that one can determine an absolute constant $c'$ (lying between $1/\sqrt2$ and $1/2\sqrt3$, or equal to one of these numbers) such that formula (53) applies for $c > c'$ and formula (54) for $c < c'$.


12°. We shall now establish the result announced above concerning the existence and continuity of the derivatives of $F(\alpha, \rho)$. We have been able to reach it only by the simultaneous use of the different methods of studying this function set out above; but we should not be surprised if a simpler proof exists.

Let us first use formula (48) in which, when $Y(t)$ and hence $I_1$ are known, $S_0$ evidently depends on a continuous law [and even one with an indefinitely differentiable distribution function; this is easily seen by studying the influence of one of the independent parameters that define $X(t)$, for example $X_0(\tfrac12)$]. It follows that there cannot be a linear relation between $S_0$ and $I_1$ that has a positive probability of being realized. The conditional probability of $S < \alpha$, when $L$ has a known value $\rho$, is therefore a continuous function of $\alpha$ and $\rho$; since one evidently obtains $F(\alpha, \rho)$ by multiplying this conditional probability by $\rho e^{-\rho^2/2}\,d\rho$, which is the probability of the interval $d\rho$, and integrating with respect to $\rho$, it follows that the derivative $G = F'_\rho$ exists and is a continuous function of $\rho$; up to the factor $\rho e^{-\rho^2/2}$, it represents the conditional probability of $S < \alpha$ under the hypothesis $L = \rho$.

Let us now use formula (51). The hypothesis $L = \rho$ does not alter the fact that, in this formula, $\zeta$ is a Gaussian variable independent of $\sigma$. The conditional probability of $S < \alpha$, evaluated under this hypothesis, is therefore also, like the unconditional probability of the same inequality, a continuous and indefinitely differentiable function of $\alpha$, except perhaps for $\alpha = 0$; $G$, and consequently $F$, are therefore also indefinitely differentiable with respect to $\alpha$.

Formula (48), taking into account the remark just made on the law of $S_0$ when $I_1$ is known, easily gives another proof of this result.
Let us now return to the argument by which we established equation (44); we shall show that the continuity of $F$ and $G$ and of all their derivatives with respect to $\alpha$ are sufficient hypotheses for this argument to hold, with some modifications. The derivative $F''_{\rho^2}$ was first introduced through the expansion of the expression

$$E\{\cdots\} = F(\alpha, \rho) + \frac{dt}{2t}\,F''_{\rho^2} + o(dt).$$

If one is not assured a priori of its existence, it follows only from the comparison of the two expressions obtained for $P$ and $P_1 = P + o(dt)$ that

$$E\{\cdots - F(\alpha, \rho)\}$$

is of the form $k\,dt/2t + o(dt)$, where $k = k(\alpha, \rho)$, which depends only on the values of $F(\alpha, \lambda)$ for $\lambda$ very close to $\rho$, is what may be called a generalized second derivative. One can then write equation (44), provided $F''_{\rho^2}$ is replaced by $k$.

According to this equation, $k$ is a continuous function of $\alpha$ and $\rho$. The difference

$$K(\alpha, \rho) = F(\alpha, \rho) - \int_0^\rho k(\alpha, \lambda)(\rho - \lambda)\,d\lambda$$

has a generalized second derivative equal to zero. It follows that it is linear in $\rho$. For if this were not so, one could find a value $\rho_0$ of $\rho$ such that the graph of the function $K$ (for constant $\alpha$) lies inside a parabola touching it at the point of abscissa $\rho_0$. The expression

$$K(\alpha, \rho) - K(\alpha, \rho_0) - (\rho - \rho_0)\,K'_\rho(\alpha, \rho_0)$$

would then be of constant sign and greater in absolute value than $c(\rho - \rho_0)^2/2$, $c$ being positive; the generalized second derivative would therefore appear, by its definition, as a mean value of a quantity of constant sign and greater than $c$ in absolute value, and could not be zero, as we had supposed. The function $K(\alpha, \rho)$ is therefore linear in $\rho$, and $F$ admits a second derivative $F''_{\rho^2}$, equal to $k$.

This is moreover a general result: if a function $f(x)$ admits a generalized second derivative, defined at each point $x$ by a formula of the type

…

$\mathfrak{M}$ denoting a weighted mean (with respect to $h$), only the small values of $h$ intervening in the limit, and if $f''(x)$ is continuous, it is a second derivative in the ordinary sense.




Let us return to equation (44), whose exactness is now established. All the terms other than $F''_{\rho^2}$ being differentiable with respect to $\rho$, this term is so too. Equation (45) follows, and, since it is of elliptic type, the functions $F(\alpha, \rho)$ and $G(\alpha, \rho)$ are continuous and indefinitely differentiable, except perhaps for $\alpha = 0$.

The result holds moreover for $\alpha = 0$. Referring to formula (51), one may remark that, if $\sigma$ were replaced by the length of a Gaussian vector, the product $\sigma\zeta$ would depend on the first law of Laplace, for which the second derivative of the distribution function is discontinuous at the origin. But to rule out the possibility of such a discontinuity it suffices to show that, in the case before us, small values of $\sigma$ are very improbable. This follows easily from the definition of $4\sigma^2$, a sum of terms that are all positive; the probability that they are all very small is exceedingly small (and this also under the hypothesis that $L$ is supposed known).

We leave to the reader the task of making this argument precise, which can be done in two different ways. One can show that, for every positive $c$,

$$\Pr\{\sigma < s\} = o(s^c) \qquad (s \to 0),$$

and deduce directly the result announced for the product $\sigma\zeta$. One can also establish only the result announced for $c = 2$, and deduce from it the continuity of the derivatives that appear in equations (44) and (45). Because of the elliptic type of this last equation, this suffices to conclude that $G(\alpha, \rho)$ is holomorphic for all real values of $\alpha$ and all positive values of $\rho$.


6. The surface measure of the curve C. The object of this section is to prove the following theorem.


THEOREM 12. The curve $C$ is a set of points whose surface measure is almost surely zero.


Part 1° of this section is devoted to a preliminary result; part 2° contains the proof of Theorem 12. Parts 3° and 4° contain remarks which seem to us apt to convey, perhaps better than the proof, the true nature of this theorem, and which in any case prepare the generalizations that will be the object of the following section; part 3° contains in addition a theorem important in itself; it gives a necessary condition for a curve to fill an area.


1°. It suffices to consider an arc $A(0)A(t)$ of the curve $C$. As it is a closed set, it has a well-determined surface measure $\mu(t)$. We propose first to prove that $\mu = \mu(1)$ is a random variable [hence so is $\mu(t)$]. In other words, $\mu$ is a measurable function of the variable $U$ whose random choice between zero and one is equivalent to the choice of $C$, the notion of probability being equivalent to measure on the $U$-axis.

The proof rests on the fact that $\mu$ is the limit in probability of a sequence of random variables; it is known that such a limit is a random variable.

To this end, let $\mu_n(\rho)$ denote the measure of the set of points interior to at least one of the $n+1$ circles with centers $A(0), A(1/n), A(2/n), \dots, A(1)$ and the same radius $\rho$. It is evidently a random variable; we shall show that, as $\rho$ tends to zero, if $n$ is a function of $\rho$ that increases rapidly enough, $\mu_n(\rho)$ tends in probability to $\mu$;¹⁵ it will follow that $\mu$ is a random variable.

On the one hand, $\mu_n(\rho)$ is bounded above, in a non-random manner, by $\bar\mu(\rho)$, the measure of the locus of points $M$ whose distance $\delta(M)$ from the arc $A(0)A(1)$ does not exceed $\rho$. Now, by a known property of bounded closed sets, $\bar\mu(\rho)$ tends to $\mu$ as $\rho$ tends to zero.

On the other hand, let $l_\nu$ denote the length of the $\nu$-th side of the polygonal line $A(0)A(1/n)A(2/n)\cdots A(1)$, and $\rho_n$ the largest of these sides. We evidently have

$$\Pr\{\rho_n > \rho\} \le \sum \Pr\{l_\nu > \rho\} = n e^{-n\rho^2/2} = \varepsilon(\rho),$$

and it suffices that $n$ increase rapidly enough as $\rho$ tends to zero for $\varepsilon(\rho)$ to tend to zero. Now, when $\rho_n < \rho$, the set of the $n+1$ circles considered contains the curve in its interior, and $\mu_n(\rho) \ge \mu$; this is therefore the case except in cases whose probability is at most $\varepsilon(\rho)$.

Finally, $\mu_n(\rho)$ is at least equal to $\mu$, except in cases whose probability tends to zero, and in all cases at most equal to $\bar\mu(\rho)$, which tends to $\mu$; the convergence in probability of $\mu_n(\rho)$ to $\mu$ is thus established.
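
The measure of a union of $n+1$ circles of common radius can be approximated numerically. The following is a minimal sketch under stated assumptions, not part of the proof: the function name `union_of_discs_area` is hypothetical, and the area is estimated by counting the cells of a square grid (side `step`) whose centre falls in at least one disc, so the result carries a discretization error of the order of `step` times the total perimeter.

```python
import math


def union_of_discs_area(centers, rho, step):
    """Approximate the area of the union of discs of radius rho centred at
    `centers`, by counting grid cells of side `step` whose centre lies in
    at least one disc (a crude estimate, assumed adequate for illustration)."""
    xs = [c[0] for c in centers]
    ys = [c[1] for c in centers]
    x_min, y_min = min(xs) - rho, min(ys) - rho
    nx = int(math.ceil((max(xs) + rho - x_min) / step))
    ny = int(math.ceil((max(ys) + rho - y_min) / step))
    r2 = rho * rho
    count = 0
    for i in range(nx):
        px = x_min + (i + 0.5) * step
        for j in range(ny):
            py = y_min + (j + 0.5) * step
            if any((px - cx) ** 2 + (py - cy) ** 2 <= r2 for cx, cy in centers):
                count += 1
    return count * step * step
```

Applied to the vertices $A(0), A(1/n), \dots, A(1)$ of a simulated path, such an estimate plays the role of the measure of the union of circles; overlapping discs are counted once, which is why the union can be much smaller than $(n+1)\pi\rho^2$.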

One proves in the same way that, for every point $M$, $\delta(M)$, the distance from this point to the curve, is a random variable; the probability that $\delta(M) = 0$, that is, that $M$ lies on the curve, is also well determined, and is a measurable function $\varphi(M)$ of the point $M$. These functions $\delta(M)$ and $\varphi(M)$ are in fact limits of those obtained by replacing the curve by the set of circles considered in the preceding argument.


¹⁵ The condition that $n$ increase rapidly enough plays an essential role in the proof in the text. But once Theorem 12 is established, one easily sees that it is unnecessary; $\mu_n(\rho)$ tends in probability to $\mu$ as $\rho$ tends to zero, and this uniformly with respect to $n$.




2°. To prove that $\mu$ is almost surely zero, let us first observe that its expected value $m$ is finite. This follows easily from the fact that

$$\Pr\{\mu > \cdots\} \le \Pr\{\operatorname{Max}(|X(t)|, |Y(t)|) > \cdots\} \le \cdots$$

Consequently, by the stochastic homogeneity of the curve, the measures of the arcs $A(0)A(1)$, $A(1)A(2)$ and $A(0)A(2)$, arcs which we shall denote respectively by $C_1$, $C_2$ and $C'$, have expected values $m$, $m$ and $2m$ respectively. Since evidently

$$E\{\operatorname{mes} C'\} = E\{\operatorname{mes} C_1\} + E\{\operatorname{mes} C_2\} - E\{\operatorname{mes} C_1C_2\}$$

($C_1C_2$ being the set of points common to $C_1$ and $C_2$), it follows that

$$E\{\operatorname{mes} C_1C_2\} = 0,$$

that is, the measure of $C_1C_2$ is almost surely zero.

We shall deduce from this that $\mu$ is almost surely zero. To this end, observe that nothing is changed as regards $\mu$ if we suppose $A(1)$ known and construct $C_1$ and $C_2$ starting from this point: they are two determinations independent of each other and depending on the same probability law. The probabilities that a point $M$ belongs to $C_1$, to $C_2$, and to $C_1C_2$ are then $\varphi(M)$, $\varphi(M)$ and $\varphi^2(M)$, and we have

$$m = \iint \varphi(M)\,dM, \qquad m' = E\{\operatorname{mes} C_1C_2\} = \iint \varphi^2(M)\,dM$$

(the integrations being extended over the whole plane); $m' = 0$ therefore entails $m = 0$; both are equivalent to: $\varphi(M)$ is zero almost everywhere.

Theorem 12 is thus proved. Note the role played, in the last part of the argument, by the stochastic independence of $C_1$ and $C_2$ [once the point $A(1)$ is known]. It is important to keep this point in mind for extensions that we shall indicate later without repeating the whole argument.

Observe on the other hand that $\varphi(M)$, which is evidently a function $\psi(r)$ of the distance $r$ from the point $M$ to the point $A(1)$, is not only zero almost everywhere, but is zero for every positive $r$. To prove this, it suffices to prove that it is a non-increasing function of $r$; this follows evidently from the stochastic similarity of the arcs $A(0)A(1)$ and $A(1-k^2)A(1)$; by this similarity, a point situated at distance $kr$ from the point $A(1)$ has probability $\psi(r)$ of belonging to the second of these arcs, and consequently, if $k < 1$, a probability $\psi(kr) \ge \psi(r)$ of belonging to the first. We thus indeed have $\psi(kr) \ge \psi(r)$, hence $\psi(r) = 0$ for every positive $r$.
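
The last step, from "zero almost everywhere and non-increasing" to "zero for every positive $r$", can be written out as follows; this is a sketch of the implicit argument, using only that $\psi \ge 0$ and that $\varphi(M) = \psi(r)$ vanishes almost everywhere in the plane.

```latex
\begin{align*}
&\text{Suppose } \psi(r_0) > 0 \text{ for some } r_0 > 0.\\
&\text{Since } \psi(k r_0) \ge \psi(r_0) \text{ for } 0 < k < 1,\quad
  \psi(r) \ge \psi(r_0) > 0 \text{ for all } 0 < r \le r_0.\\
&\text{Then } \varphi(M) > 0 \text{ on the punctured disc of radius } r_0 \text{ about } A(1),
  \text{ a set of measure } \pi r_0^2 > 0,\\
&\text{contradicting } \varphi(M) = 0 \text{ almost everywhere.
  Hence } \psi(r) = 0 \text{ for all } r > 0.
\end{align*}
```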




As for the point $A(1)$ itself, it is evidently on the curve; but one easily sees that the probability that it is a double point is zero. Although there is a non-denumerable infinity of double points, the double points are exceptional points on $C$; their parameter values constitute a set of measure zero.


3°. Let us now indicate a necessary condition for a curve to fill a set of positive surface measure.


THEOREM 13. In order that a continuous curve fill a set of positive surface measure $m$, it is necessary that, for suitably chosen points of division, arbitrarily dense on the curve, the sum

(34)  $B_n = \sum (\Delta l)^2$

be at least equal to $cm$, $c$ being a constant (at least $3\sqrt3/\pi$; we do not know the best possible value of this constant).


When we say that the points of division are arbitrarily dense, we mean that, whatever the increasing sequence of numbers between zero and one, $t_0 = 0, t_1, \dots, t_n = 1$, there is at least one of the points considered in each of the arcs $A(t_{h-1})A(t_h)$.

Let $\mu_h$ denote the surface measure of this arc $A(t_{h-1})A(t_h)$. We evidently have $\sum \mu_h \ge m$.

Now if a closed set has surface measure $\mu_h$, one can find in this set two points whose distance is at least equal to the diameter $\Delta_h$ of the circle of area $\mu_h$. One easily verifies in fact that a contour none of whose chords attains this length cannot enclose an area equal to or greater than $\mu_h$.

Let then $A(t_h')A(t_h'')$ be a chord of $A(t_{h-1})A(t_h)$ of length at least $\Delta_h$; we may suppose $t_h' < t_h''$. For the polygonal line

$$A(0)A(t_1')A(t_1'')A(t_2')A(t_2'')\cdots A(t_n')A(t_n'')A(1)$$

inscribed in $C$, the sum (34) is at least equal to

$$\sum \Delta_h^2 = \frac{4}{\pi}\sum \mu_h \ge \frac{4}{\pi}\,m,$$

which establishes Theorem 13, except as regards the value of the constant.

One can improve the value of this constant, and at the same time obtain another result of some interest, by singling out three points $A(t_h')$, $A(t_h'')$, $A(t_h''')$ of each arc $A(t_{h-1})A(t_h)$. One easily sees that they can always be chosen so that the area of the triangle of which they are the vertices is at least equal to $c'\mu_h$ ($c' = 3\sqrt3/4\pi$); in the case of a circular area, this value represents the area of the inscribed equilateral triangle, and cannot be exceeded; apart from the cases of circular or elliptic areas, it can surely be exceeded.

The triangles

$$A(t_1')A(t_1'')A(t_1'''),\quad A(t_1''')A(t_1)A(t_2'),\quad A(t_2')A(t_2'')A(t_2'''),\ \dots$$

then form a chain of inscribed triangles, analogous to the area $S'_n$ of §5, 1°, whose total area is at least $c'\sum \mu_h \ge c'm$. This refers, of course, to the sum of the areas of these triangles taken in absolute value. The existence of such a chain for which this sum is at least equal to $c'm$ is therefore a necessary condition for the curve to fill a set of surface measure at least equal to $m$.

Consider then the polygonal line inscribed in $C$ having as vertices all those of these triangles. In a triangle, the sum of the squares of two sides is at least equal to four times the area. The sum $B_n$ relative to this polygonal line is therefore at least equal to $4c'm$; one thus obtains the constant $4c' = 3\sqrt3/\pi$ indicated in the statement of Theorem 13. It is moreover evident that it is not the greatest possible value for the constant $c$ of this theorem. It would be interesting to determine this maximal value; it has not seemed to us useful, on the other hand, to lengthen the arguments in order to obtain a value a little greater than $3\sqrt3/\pi$ which would still not be the maximal value.
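
The two elementary facts used here, the triangle inequality "two squared sides are at least four times the area" and the value $c' = 3\sqrt3/4\pi$ for the circle, can be checked on a worked instance. This is an illustrative sketch only; the names `sum_of_squared_chords`, `triangle_area`, `tri` and `c_prime` are hypothetical, and the example takes the unit circle (area $\pi$) with its inscribed equilateral triangle.

```python
import math


def sum_of_squared_chords(points):
    """B_n: sum of the squared side lengths of an open polygonal line."""
    return sum((x2 - x1) ** 2 + (y2 - y1) ** 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


def triangle_area(p, q, r):
    """Absolute area of the triangle pqr (half the cross product)."""
    return 0.5 * abs((q[0] - p[0]) * (r[1] - p[1])
                     - (r[0] - p[0]) * (q[1] - p[1]))


# Equilateral triangle inscribed in the unit circle, whose area is pi.
tri = [(math.cos(a), math.sin(a))
       for a in (math.pi / 2,
                 math.pi / 2 + 2 * math.pi / 3,
                 math.pi / 2 + 4 * math.pi / 3)]

# Ratio of the triangle's area to the area of the circle.
c_prime = triangle_area(*tri) / math.pi
```

For any triangle with sides $a$, $b$ enclosing an angle $\theta$, $a^2 + b^2 \ge 2ab \ge 2ab\sin\theta$, which is four times the area; the polygonal line through the three vertices uses two of the sides.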

Let us indicate on the other hand, without proof, that if chance is introduced into the choice of the points of division as we did in defining the notion of Brownian oscillation, then at least for a suitable parametric representation of the curve studied (the probability being measured by the variation of the parameter), and for sufficiently small values of $c$, we have

$$\limsup \Pr\{B_n \ge cm\} = \alpha > 0.$$

The modes of division that realize the condition $B_n \ge cm$ therefore do not appear as exceptional. The probability $\alpha$ can only increase when smaller and smaller values are taken for $c$; but it is not at all certain that it varies continuously; one may ask whether we are not in the presence of one of those cases, frequent in the theory of denumerable probabilities, where the probability cannot lie between zero and one; it would pass abruptly from one of these values to the other for a definite value of $c$.

Returning to the results established with certainty, we see that two ideas emerge from them. One is that the total area of the chains of inscribed triangles analogous to the area $S'_n$ of §5, 1° is in some sense an approximation to the surface measure of the curve; for the curve to be able to fill an area, this total area must not be very small. On the other hand it is necessary that, at least for a suitable parametric representation, the length $\Delta l$ of the chord $A(t)A(t+dt)$ be in general of the order of magnitude of $\sqrt{dt}$ or larger, which entails the consequence, independent of the parametric representation, that $B_n$ is not very small; if $B_n$ is not very small, the curve makes enough infinitely small detours to be able to fill an area, and the large values taken by $B_n$, as $n$ tends to infinity, measure fairly well what might be called the possibility for the curve to fill an area.


4°. Let us now introduce a new idea. We consider especially the curves for which, as for Brownian motion, the chord of the arc described during the time $dt$ is of the order of magnitude of $\sqrt{dt}$, so that the sums $B_n$ relative to a finite arc are neither very small nor very large. The curve thus makes exactly enough infinitely small detours to be able to fill an area. But this is only a necessary condition, not a sufficient one: for the curve actually to fill an area, there must in addition be an organization of these infinitely small detours that chance has no prospect of producing. Only a precise mathematical law can guide the path of the moving point, in zones already largely covered, in such a way that it moves only in the gaps and ends by filling them. The curves $T_1$ and $T_2$ of which we shall speak in the following section give examples of this circumstance.

When chance plays a sufficient role, if the curve comprises enough infinitely small detours to fill an area $m$, one must therefore expect it to fill only an area $m' = m/k < m$, the different parts of this area being on the average filled $k$ times ($k > 1$); if $k$ is finite, $m'$ is positive; if $k$ is infinite, $m'$ is zero. By Theorem 12, it is the second circumstance that is realized. We shall present some remarks which could lead to a new proof, but which, in the summary form we shall give them, are only fairly serious intuitive reasons for believing that $k$ is infinite.

Because of the stochastic similarity of the different arcs of the curve, instead of examining one and the same arc at smaller and smaller scales, we may examine larger and larger arcs at a fixed scale. This leads us, for example, to study the broken line $A(0)A(1)A(2)\cdots$ indefinitely prolonged; we shall suppose the vertices of this line marked on a sheet of squared paper, the sides of the squares of the grid being equal to the unit of length or a little smaller than this unit, so that two consecutive points $A(n)$ and $A(n+1)$ have an appreciable chance of not being in the same square. Let us first show that there is probability one that the square of the grid containing $A(0)$ contains infinitely many other vertices of this broken line.

To this end it suffices to show that, whatever $A(n)$, supposed known, may be, one can determine $N$ so that at least one of the points $A(n+1), A(n+2), \dots, A(n+N)$ lies in the square of the grid containing $A(0)$, and this in cases whose probability is not very small. Indeed, except in improbable cases, these $N$ points are at a distance from $A(n)$ not exceeding $c\sqrt N$, $c$ being a suitably determined constant. They can therefore be distributed only among squares of the grid whose number does not exceed $\pi(c\sqrt N + \sqrt2)^2$, or approximately $\pi c^2 N$, and for each of these squares one can bound from below the probability that it contains one of these points (since well-known principles are involved, we do not dwell on the details of the proof). It then suffices that $c\sqrt N$ exceed the distance $A(0)A(n)$ for this conclusion to apply to the square of the grid containing $A(0)$; there will be a probability greater than a fixed number $\alpha$ that it contains a point $A(\nu)$ with index between $n$ and $n+N$.

One can then surely determine a sequence of increasing integers $n_1, n_2, \dots, n_h, \dots$, such that, once the first $n_h$ points $A(\nu)$ are known, there is a probability greater than $\alpha$ that one of the $N_h = n_{h+1} - n_h$ following points lies in the square of the grid containing $A(0)$; $N_h$, depending on the point $A(n_h)$, is random, but surely bounded. It is known that, under these conditions, it is almost sure that one will obtain indefinitely many points $A(\nu)$ situated in the square containing $A(0)$.¹⁶
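
The recurrence described here can be observed on a simulated broken line. The following is a minimal sketch under stated assumptions, not a verification of the almost-sure statement: the function name `count_visits_to_origin_square` is hypothetical, the increments $A(n)A(n+1)$ are taken as independent reduced Gaussian vectors, and $A(0)$ is placed at the centre of its grid square, which is an arbitrary choice made for the illustration.

```python
import random


def count_visits_to_origin_square(n_steps, side=1.0, seed=0):
    """Simulate A(0), A(1), ..., A(n_steps) with independent reduced Gaussian
    increments and count the vertices lying in the grid square containing A(0).
    A(0) is assumed to sit at the centre of that square."""
    rng = random.Random(seed)
    x = y = 0.0
    half = side / 2.0
    visits = 1  # A(0) itself lies in the square
    for _ in range(n_steps):
        x += rng.gauss(0.0, 1.0)
        y += rng.gauss(0.0, 1.0)
        if -half <= x < half and -half <= y < half:
            visits += 1
    return visits
```

With a fixed seed the count is non-decreasing in `n_steps`; in runs of increasing length one would expect it to grow without bound, though only slowly, in keeping with the order of magnitude of the probabilities discussed in the text.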

The same result naturally applies to any square of the grid, and one easily concludes that the set of points $A(n)$ almost surely forms a set everywhere dense in the plane; the same is true, a fortiori, of the polygonal line having these points as vertices, and of the curve $C$ itself.

¹⁶ The remarks set out in the last two paragraphs reproduce more or less some considerations presented by G. Pólya in a lecture given at the Colloquium on the principles of the calculus of probabilities held at Geneva in October 1937 and published by Hermann.

One can then picture as follows the appearance of this polygonal line limited to its first $n$ sides, $n$ being large. The greatest distance between two of its points will be of the order of magnitude of $\sqrt n$, and it will certainly not cover with great density the smallest convex region surrounding it; there will be gaps, and there will be parts of this region $R$ where the line considered passes only once or a small number of times. But the fact that the preceding remarks apply to any square of the grid interior to this region and known to contain at least one vertex $A(\nu)$¹⁷ proves that most of the squares that contain one vertex contain a large number of them. If then, to get an idea of the area covered, one considers either the set of squares of the grid containing at least one vertex, or the chain of triangles $A(2\nu)A(2\nu+1)A(2\nu+2)$, one sees that the area thus defined will not be, as one might expect at first sight, of the order of magnitude of $n$; it will be small compared with $n$, and composed largely of regions covered a large number of times. It necessarily follows that there are gaps, the largest of which will be a non-negligible fraction of the region $R$,¹⁸ and which would be covered only by the prolongation of the polygonal line studied beyond its first $n$ sides.

Let us now use the stochastic similarity of the different arcs of the curve $C$. The preceding results can be applied to the study of the line

$$A(0)A(1/n)A(2/n)\cdots A(1),$$

for a very large value of $n$. We first find a result that ties in with the remarks of §3, 5°, in fine: the curve reaches a point, in general, only after having passed near it a large number of times; the distance $A(t)A(t+\tau)$, as $\tau$ tends to zero, is in general of the order of magnitude of $\sqrt\tau$; but it is sometimes larger and sometimes smaller, which gives the curve the appearance of a succession of loops, smaller and smaller and closer and closer to the point $A(t)$; at an exceedingly small scale, one would see the point $A(t+\tau)$ approach $A(t)$, then move away from it, and this a large number of times before the distance $A(t)A(t+\tau)$ ceases to be appreciable.

On the other hand, as regards the area, we see that a chain of inscribed 
triangles such as the one denoted by $S'_n$ in §5, although the sum of the areas 
of these triangles taken in absolute value has a positive limit for n infinite, 
covers only a smaller and smaller area, but one covered a larger and larger 
number of times. Since this area can be regarded as an approximation to the 
area covered by the curve, we are led to conclude that the superficial measure 
of the curve is zero. A suitable extension of Theorem 1 would make this 
reasoning rigorous, 


17 The need for this restriction is evident: a point taken at random in R 
is at a distance from $A(0)$ of the order of magnitude of $\sqrt{n}$. It is 
therefore only after placing a number of vertices $A(\nu)$ large compared with n 
that one has a high probability of finding one that is close to the given point. 

18 Indeed, for a point taken at random in the empty regions, the reasoning 
of the preceding note must be applicable. Now it is not applicable to a point 
whose distance from one of the $A(\nu)$ is small compared with $\sqrt{n}$. 


540 M. PAUL LEVY. 


and would lead to a new proof of Theorem 12. The original proof is obviously 
simpler; but the preceding remarks seemed useful to us in order to show that: 
the curve C, while having enough infinitely small detours to cover an area, 
nevertheless has zero superficial measure, because the disorderly behavior of 
the moving point does not allow the methodical sweeping of an area; it is 
infinitely improbable that such a sweeping should be achieved. 


7. Various generalizations. 1°. One of the essential features of Brownian 
motion is the stochastic similarity of any two arcs of the trajectory. This 
feature is independent of the number of dimensions of the space considered, 
and persists under an affine transformation; that is, the isotropic Gaussian 
law may be replaced by a non-isotropic Gaussian law. But since the stable laws 
other than the Gaussian ones lead to curves that are almost surely 
discontinuous, it does not seem possible to find other schemes having this 
feature and leading to continuous curves.19 

On the other hand, it is easy to define a great variety of schemes for which 
the curve described as t varies from zero to one is a union of arcs 
stochastically similar to the whole curve. To limit ourselves, we shall study 
only the curves for which the two arcs $A(0)A(\tfrac12)$ and $A(\tfrac12)A(1)$ are 
stochastically similar to the whole curve; since each of these arcs decomposes 
in turn under the same conditions, and so on, we see that each of the arcs 
$A(h\cdot 2^{-n})\,A[(h+1)2^{-n}]$ is stochastically similar to the whole curve. 
The points whose parameter values are of the form $h\cdot 2^{-n}$ are then special 
points of the curve; the behavior of the curve at such a point will not 
resemble its behavior at an arbitrary point. The polygonal lines $L'_n$ having 
these points as vertices, and the chains of inscribed triangles denoted by 
$S'_n$ at the beginning of §5, will differ essentially from the other inscribed 
polygonal lines and the other chains of inscribed triangles; one must expect 
to find for the $L'_n$ and the $S'_n$ simple properties that cannot be extended 
without modification to the other inscribed lines $L_n$ and to the other chains 
of inscribed triangles. 

On the other hand, what was not possible (apart from the case of uniform 
rectilinear motion) when one required the similarity of any arc of the curve 
with the whole curve becomes possible here: the similarity may be genuine, and 
not merely stochastic. We thus recover curves of which we made a systematic 
study in a recent memoir (Journal de l'Ecole Polytechnique, 1938); two of 
these curves will be considered in what follows, and denoted by $\Gamma_0$ and 
$\Gamma_1$; $\Gamma_1$ is the well-known curve that fills the area of an isosceles 
right triangle; for the curve $\Gamma_0$, we refer to our memoir of 1938 for the 
proof of its principal properties. 


19 In any case this is evident if one restricts oneself to schemes for which 
the successive displacements of the moving point are stochastically independent. 


2°. We shall consider first the curves $\Gamma$ for which the triangle 
$A(0)A(\tfrac12)A(1)$ is an isosceles right triangle with hypotenuse $A(0)A(1)$. 
Each of the $2^n$ triangles making up the area $S'_n$ will also be an isosceles 
right triangle; if $A(0)A(\tfrac12)$ is taken as the unit of length, the legs of 
each of these triangles will have length $q^n$ $(q = 1/\sqrt{2})$. Once the 
hypotenuse is placed, the vertex of the right angle has two possible positions, 
and the area of the triangle, equal in absolute value to $2^{-(n+1)}$ for the 
triangles of $S'_n$, will be positive or negative according to the vertex chosen. 
The curve will therefore be well defined by specifying a sequence of signs; we 
shall denote by $\varepsilon_n^{(h)} = \pm 1$ the sign attached to the $h$-th 
triangle of $S'_n$. We shall suppose $\varepsilon_0^{(1)} = 1$, which is not an 
essential restriction. 

We shall study in particular the two non-random curves $\Gamma_0$ and $\Gamma_1$ 
defined, the first by $\varepsilon_n^{(h)} = 1$, the second by 
$\varepsilon_n^{(h)} = (-1)^n$, and the two random curves $\Gamma_2$ and $\Gamma_3$, 
for each of which the signs are determined by drawing lots with equal chances 
for the two signs; but for $\Gamma_2$ the sign depends only on n, and a single 
draw determines the orientation of all the triangles of $S'_n$; for $\Gamma_3$ 
there is a separate draw for each triangle. 
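
The four sign rules lend themselves to a small numerical sketch. The Python below is an illustration under stated assumptions, not the author's own construction: it assumes that a sign of +1 places each new vertex to the left of its side (the text fixes no such convention explicitly), it represents points as complex numbers, and all helper names (`refine`, `inscribed_polygon`, `make_gamma2`, and so on) are hypothetical. It builds the vertices of $L'_n$ and computes the sum of squared sides and the area between $L'_n$ and the chord, counted with multiplicity.

```python
import random

def refine(vertices, signs):
    """Replace each side (a, b) by the two legs of a right isosceles triangle.

    signs[h] = +1 puts the new vertex to the left of side h, -1 to the right
    (assumed convention for the sign epsilon).
    """
    out = [vertices[0]]
    for sign, a, b in zip(signs, vertices, vertices[1:]):
        apex = (a + b) / 2 + sign * 1j * (b - a) / 2
        out.extend([apex, b])
    return out

def inscribed_polygon(n, rule):
    """Vertices of L'_n; rule(k, h) is the sign of triangle h of S'_k."""
    vertices = [0j, complex(2 ** 0.5)]      # chord A(0)A(1), squared length 2
    for k in range(n):
        signs = [rule(k, h) for h in range(len(vertices) - 1)]
        vertices = refine(vertices, signs)
    return vertices

def sum_of_squares(vertices):
    """B_n: sum of the squared side lengths."""
    return sum(abs(b - a) ** 2 for a, b in zip(vertices, vertices[1:]))

def enclosed_area(vertices):
    """Shoelace area between polygon and chord, counted with multiplicity."""
    closed = vertices + vertices[:1]
    return 0.5 * sum((a.conjugate() * b).imag for a, b in zip(closed, closed[1:]))

def gamma0(k, h):
    return 1                                # same sign everywhere

def gamma1(k, h):
    return (-1) ** k                        # sign alternates with the level

def make_gamma2(rng):
    """One draw per level k (level 0 fixed to +1, as in the text)."""
    drawn = {0: 1}
    def rule(k, h):
        if k not in drawn:
            drawn[k] = rng.choice((-1, 1))
        return drawn[k]
    return rule

def make_gamma3(rng):
    """One independent draw per triangle (level 0 fixed to +1)."""
    return lambda k, h: 1 if k == 0 else rng.choice((-1, 1))
```

With these helpers, `sum_of_squares(inscribed_polygon(n, rule))` stays equal to 2 for every rule, up to rounding. The absolute value of `enclosed_area` is n/2 for `gamma0` and alternates between 0 and 1/2 for `gamma1`.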

Of course, arbitrary rules would not give curves composed of arcs 
stochastically similar to the whole curve. The complete enumeration of the 
curves $\Gamma$ for which there is similarity (genuine or stochastic) between 
each of the arcs $A(0)A(\tfrac12)$ and $A(\tfrac12)A(1)$ and the whole curve would 
be rather long. There also exist curves $\Gamma$ composed of four (or eight, or 
sixteen) arcs stochastically similar to the curve, and not two; such would be 
the case if a single draw were to determine the orientation of the triangles 
of $S'_n$ for two (or three, or four) consecutive values of n. 

For the curve $\Gamma_0$, the total area of the triangles of $S'_n$, which are 
all positively oriented, has the value $\tfrac12$, the area of the initial 
triangle. The different triangles of $S'_n$ never overlap, so that the area 
covered by $S'_n$ is $\tfrac12$. This suggests that the curve $\Gamma_0$ covers an 
area equal to $\tfrac12$; this is what we proved in the work just cited. 

On the other hand the area $S'_0 + S'_1 + \cdots + S'_{n-1}$, the area lying 
between $L'_0$ and $L'_n$, counted by assigning to each of its parts a numerical 
coefficient indicating how many times it is encircled, is equal to $n/2$; it 
increases indefinitely, and the area lying between the curve $\Gamma_0$ and its 
chord is infinite. This is readily understood by observing that the curve is 
composed of loops that always turn in the same direction. One easily deduces 
that, for any choice of the points of division, one would always have an 
infinite area. 

In the case of the curve $\Gamma_1$, each of the areas $S'_n$ covers exactly 
the area of the initial triangle; in the limit, the curve fills the triangle: 
for $\Gamma_0$ and $\Gamma_1$, a precise mathematical law, for reasons that are 
evident in the case of $\Gamma_1$ and much more hidden in the case of $\Gamma_0$, 
succeeds in doing what chance cannot do: the curve fills an area without any 
part of this area being covered more than once. 

Taking the signs into account now, the area lying between $L'_0$ and $L'_n$ 
takes the form 

$$\tfrac12\,(1 - 1 + 1 - \cdots \pm 1) = \tfrac12 \sum_{k=0}^{n-1} (-1)^k,$$

so that it is equal to zero if n is even and to $\tfrac12$ if n is odd. This can 
be seen geometrically by observing that, after removal of rectilinear segments 
each of which is traversed once in each direction, the lines $L'_n$ reduce to 
$L'_0$ or to $L'_1$, according to the parity of n. The area lying between the 
curve $\Gamma_1$ and the initial segment $A(0)A(1)$ thus appears as indeterminate 
between zero and $\tfrac12$. 

For the curve $\Gamma_2$, the area $S'_n$ has the value $\varepsilon_n/2$. The 
area lying between the curve and the initial segment $A(0)A(1)$ is then 
comparable to the winnings of a player in an indefinitely prolonged game of 
heads or tails; it is indeterminate, not between two fixed limits, but between 
$-\infty$ and $+\infty$. On the other hand, a reasoning identical to the one given 
for $\Gamma_0$ in our memoir cited above shows that the different triangles of a 
given area $S'_n$ do not overlap: starting from a network of triangles covering 
the plane, each sequence of signs $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ 
leads to a network of areas $S'_n$ covering the plane exactly once, and, in the 
limit, one obtains a network of curves $\Gamma_2$, infinitely entangled with one 
another but covering the plane exactly once; there is reason to think that 
each one covers an area equal to that of the initial triangle.20 

The fact that the same draw defines the orientations of all the triangles 
of a given area $S'_n$ suffices to constitute that precise law which does what 
chance could not do: two triangles of $S'_n$ cannot overlap. 

It is no longer so for the curve $\Gamma_3$, in whose definition chance plays 
a much greater part. First, each area $S'_n$, taking into account the signs of 
its triangles, can be likened to the winnings of a player after $2^n$ tosses of 
heads or tails, the stake at each toss being $2^{-(n+1)}$. It is an 
asymptotically Gaussian variable, whose root-mean-square deviation is 
$q^{n+2}$ $(q = 1/\sqrt{2})$. The series $\sum S'_n$ is therefore a series with 
independent terms which converges in quadratic mean, hence almost surely. The 
area S lying between the curve and its chord is therefore stochastically well 
defined, under the same conditions as for Brownian motion; one can also show 
that with points of division chosen at random there is, under the same 
conditions as for Brownian motion, almost sure convergence to the same limit 
S; but it is not an area defined in the sense of Riemann. 


20 At least one easily proves that each one has a superficial measure at least 
equal to that of this triangle. 

Since it is a sum of independent random terms, the law on which it depends 
is easily defined by its characteristic function. This characteristic function 
is 

(55)   $$\prod_{n=0}^{\infty} \left(\cos\frac{z}{2^{n+1}}\right)^{2^n} = \frac{\sin z}{z}\,\prod_{n=1}^{\infty} \left(\frac{2^n}{z}\sin\frac{z}{2^n}\right)^{2^{n-1}}.$$

The second expression, which corresponds to an evident grouping of the factors 
of the first, and hence also to the corresponding grouping of the triangles of 
which S is the sum, shows that S is the sum of independent variables each 
having a characteristic function of the form $(\lambda/z)\sin(z/\lambda)$; that is, 
each such variable is chosen arbitrarily between $-1/\lambda$ and $+1/\lambda$ with 
a uniform distribution of probability. It thus depends on an absolutely 
continuous law, and the same is true of S. 
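
The regrouping of factors can be checked numerically. The sketch below is only an illustration: it assumes the characteristic function is the infinite product of $\cos(z/2^{n+1})^{2^n}$ over $n \geq 0$ (one factor per triangle, with stake $2^{-(n+1)}$), truncates both products at a finite number of levels, and uses hypothetical function names. Under the same assumption the variance of S is $\sum_n 2^{-(n+2)} = 1/2$, which the small-z behavior of the product reflects.

```python
import math

def cf_cosine_product(z, levels=30):
    """Truncated product of cos(z / 2**(n+1)) ** (2**n), n = 0 .. levels-1."""
    value = 1.0
    for n in range(levels):
        value *= math.cos(z / 2 ** (n + 1)) ** (2 ** n)
    return value

def cf_sine_product(z, levels=30):
    """Same function regrouped into uniform-law factors sin(x)/x (z != 0)."""
    value = math.sin(z) / z
    for n in range(1, levels):
        x = z / 2 ** n
        value *= (math.sin(x) / x) ** (2 ** (n - 1))
    return value

def variance_from_cf(cf, h=0.01):
    """Estimate the variance from cf(h) ~ 1 - variance * h**2 / 2."""
    return 2.0 * (1.0 - cf(h)) / h ** 2

if __name__ == "__main__":
    for z in (0.3, 1.0, 2.5):
        print(z, cf_cosine_product(z), cf_sine_product(z))
    print("variance estimate:", variance_from_cf(cf_cosine_product))
```

The truncation depth is a numerical convenience. Factors beyond about thirty levels are indistinguishable from 1 in double precision for moderate z.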

Let us now show that: the superficial measure of the curve $\Gamma_3$ is 
zero. The principle of the reasoning is the same as in the case of Brownian 
motion (§6, 1° and 2°). But here, instead of a factor $\phi(M)$ occurring 
twice, one must introduce two factors $\phi_1(M)$ and $\phi_2(M)$, which 
represent respectively the probabilities that M belongs to the arcs 
$A(0)A(\tfrac12)$ and $A(\tfrac12)A(1)$; they are respectively equal to 
$\psi(r, \theta)$ and $\psi(r, \pi/2 - \theta)$, r denoting the distance 
$A(\tfrac12)M$ and $\theta$ the angle $A(0)A(\tfrac12)M$. We know that the product 
$\psi(r,\theta)\,\psi(r,\pi/2-\theta)$ is zero almost everywhere; what has to be 
shown is that each of the factors is zero almost everywhere. What was evident 
when the two factors were equal is no longer so here. 

But a very simple device will enable us to reach the result. There is one 
chance in four that the signs of the two triangles of $S'_1$ are such that the 
points $A(\tfrac14)$ and $A(\tfrac34)$ coincide; in that case the arcs 
$A(\tfrac14)A(\tfrac12)$ and $A(\tfrac12)A(\tfrac34)$ are two independent 
determinations of one and the same random curve, and the probability that a 
point M belongs to one or the other of these arcs has one and the same value 
$\phi(M)$. If then $\Gamma_3$ had a positive superficial measure, and this in 
cases of positive probability, there would also be a positive probability that 
the arcs $A(\tfrac14)A(\tfrac12)$ and $A(\tfrac12)A(\tfrac34)$, which are 
stochastically similar to it, have positive superficial measures, and that in 
addition $A(\tfrac14)$ and $A(\tfrac34)$ coincide. The independence of these arcs, 
once the points $A(\tfrac14)$, $A(\tfrac12)$ and $A(\tfrac34)$ have been placed, 
allows the reasoning to be completed almost as in the case of Brownian motion: 
the measure of the set of points common to the two arcs considered could be 
positive, in cases of positive probability. The same would be true of the 
points common to the arcs $A(0)A(\tfrac12)$ and $A(\tfrac12)A(1)$. Now, by the 
first part of the reasoning, which holds without modification, we know that 
this is impossible; this establishes the result announced. 

We see that, if this result could be obtained, it is because the part 
played by chance is much greater than for the curve $\Gamma_2$; for that curve, 
the arcs $A(\tfrac14)A(\tfrac12)$ and $A(\tfrac12)A(\tfrac34)$ are equal; for 
$\Gamma_3$ they are stochastically independent [once the point $A(\tfrac14)$ has 
been placed]; this independence plays an essential part in the preceding 
reasoning. 


3°. To conclude the study of the curves $\Gamma$, we shall present some 
remarks concerning the sum 

(34)   $B_n = \sum (\Delta l)^2,$ 

which, extended over the sides of one of the lines $L'_n$, has the non-random 
value 2. If it is studied in the case of a line $L_n$ whose vertices are chosen 
at random between zero and one, considerations analogous to those set out for 
Brownian motion show that, if the points of division, once chosen, are 
retained, it is almost certain that for n infinite $B_n$ differs infinitely 
little from its expected value.21 But this expected value is no longer a 
constant; it is of the form $P(\log n) + \varepsilon_n$, $P(\lambda)$ being a 
periodic function and $\varepsilon_n$ tending to zero. The sequence of the $B_n$ 
will therefore exhibit, on the logarithmic scale, asymptotically periodic 
oscillations. 

To establish this result, consider first the expected value 
$\phi^2 = \phi^2(\tau)$ of $(\Delta l)^2$, $\Delta l$ being the length of a chord 
for which $\Delta t$ has a given value $\tau$, the parameter value t of its origin 
being chosen at random between 0 and $1 - \tau$. If $\tau$ is halved, $\phi^2$ is 
approximately halved; one evidently obtains $\phi^2(\tau)/2$ as the expected 
value of $(\Delta l)^2$ for $\Delta t = \tau/2$ and t chosen at random between 0 
and $(1-\tau)/2$ or between $\tfrac12$ and $1 - \tau/2$. The expected value of 
$(\Delta l)^2$, for $\Delta t = \tau/2$ and t chosen at random between 0 and 
$1 - \tau/2$, will therefore be of the form $\tfrac12\phi^2(\tau)\,[1 + O(\tau)]$ 
{this is easily seen by observing that $\Delta l$ is always $O[\phi(\tau)]$}. If 
$\tau$ is then given successively the values $\tau, \tau/2, \tau/4, \ldots$, we 
see that $(2^p/\tau)\,\phi^2(\tau/2^p)$ tends to a limit as p becomes infinite, 
which amounts to saying that 

(56)   $\dfrac{\phi^2(\tau)}{\tau} = P_1(\log \tau) + \varepsilon(\tau),$ 

$\varepsilon(\tau)$ tending to zero with $\tau$, and $P_1$ having the period $\log 2$. 


21 For the random schemes $\Gamma_2$ and $\Gamma_3$, it is of course understood that this refers to the 




Consider now the points $t_1, t_2, \ldots, t_n$ chosen at random between zero 
and one, which are the parameter values of the vertices of $L_n$. We know that 
each of the intervals $\Delta t = \tau$ separated by these points depends on the 
law defined by 

$$\Pr(n\tau > x) = \left(1 - \frac{x}{n}\right)^n \to e^{-x} \quad (n \to \infty),$$

and that, if n is large, the different possible values of $n\tau$ occur with 
frequencies very probably very little different from their probabilities (this 
result can even be made precise in the sense of the strong law of large 
numbers). There is moreover, asymptotically, stochastic independence between 
the origin t and the length $\tau$ of the intervals considered, so that, for an 
interval of known length $\tau$, the expected value of $(\Delta l)^2$ is indeed 
$\phi^2(\tau)$. We deduce that asymptotically 

$$E\{B_n\} = \int_0^\infty x\,P_1\!\left(\log\frac{x}{n}\right) e^{-x}\,dx + \varepsilon'_n = P(\log n) + \varepsilon_n,$$

$\varepsilon_n$ tending to zero and $P(\log n)$ being a periodic function with period 
$\log 2$, q. e. d. 
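
The law of the scaled gaps $n\tau$ is easy to examine by simulation. The following sketch is illustrative only: it assumes the tail formula $(1 - x/n)^n$ for the gaps cut by n independent uniform points, and its function names are hypothetical. It compares the empirical frequency of gaps exceeding a threshold with that formula and with its exponential limit.

```python
import math
import random

def scaled_gaps(n, rng):
    """Return n * (gap length) for the n + 1 gaps cut by n uniform points."""
    cuts = sorted(rng.random() for _ in range(n))
    points = [0.0] + cuts + [1.0]
    return [n * (b - a) for a, b in zip(points, points[1:])]

def tail_frequency(values, x):
    """Fraction of the values exceeding x."""
    return sum(v > x for v in values) / len(values)

def exact_tail(n, x):
    """Pr(n * tau > x) = (1 - x/n)**n for a single gap (0 beyond x = n)."""
    return (1.0 - x / n) ** n if x < n else 0.0

if __name__ == "__main__":
    rng = random.Random(0)
    gaps = scaled_gaps(5000, rng)
    for x in (0.5, 1.0, 2.0):
        print(x, tail_frequency(gaps, x), exact_tail(5000, x), math.exp(-x))
```

A fixed seed is used only so that the run is reproducible. The agreement between the empirical frequencies and the exponential limit improves as n grows.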

Note that $P(\log n)$, being an average of the different values of 
$P_1(\log x/n)$, varies only between limits fairly close to one another. Since 
$B_n$, if n is large, very probably (and even almost surely) differs very 
little from its expected value, one cannot speak of a well-defined Brownian 
oscillation as in the case of Brownian motion; this oscillation is rather 
indeterminate between two limits close to one another; this indeterminacy, 
moreover, is not of a random character: $B_n$ in fact very probably (and even 
almost surely, under the same conditions as for Brownian motion) differs very 
little from the non-random function $P(\log n)$.22 

These periodic oscillations can moreover be avoided by modifying the choice 
of the points of division in the following way: we choose one point of 
division $t_0$ at random between zero and one, with a uniform distribution of 
probability; then two points, one in each of the two intervals $(0, t_0)$ and 
$(t_0, 1)$; then four new points in the intervals thus distinguished, and so 
on. After p analogous operations, one will have defined a polygonal line 
$L''_p$ with $2^p$ sides, inscribed in $\Gamma$. One might think that the periodic 
oscillation noted above disappears here simply because only integral values of 
$p = \log n/\log 2$ are considered; but another remarkable circumstance also 
arises. The $n = 2^p$ values of $\tau = \Delta t$ corresponding to the sides of 
$L''_p$ are here not, for the most part, of the order of magnitude of $1/n$; the 
n values of the products $n\tau$ are spread over a much wider interval than in 
the preceding case, and, for any finite interval on the scale of 
$\log 1/\tau$, the probability tends to zero for n infinite and tends to be 
distributed over it with a constant density. One will then have to consider, 
instead of $P(\log n)$, an average of the different values of $P_1(\log \tau)$, 
which reduces in the limit to 

$$B = \frac{1}{\log 2}\int_0^{\log 2} P_1(u)\,du,$$

and the values of $B_n = B''_p$ corresponding to the polygonal lines $L''_p$ tend 
in probability, and even almost surely, to B. It should be noted that there is 
no reason to think that B has the same value 2 as in the case of the lines 
$L'_n$; that was a particular value due to the special part played by the lines 
$L'_n$ in the definition of the curves $\Gamma$; here we have an average, almost 
surely realized under the conditions in which we have placed ourselves. One 
would moreover have the same limiting value for $B_n$ if one started from an 
initial division of the interval (0, 1) into h equal intervals (or intervals 
chosen at random), each of which would then be subdivided as just indicated. 
In the preceding remarks, one might expect to find as limit, instead of the 
constant B, a periodic function of $\log h$. This is not so, and this constant 
B seems to give, for each of the types of curves $\Gamma$, a good measure of what 
may be called the generalized Brownian oscillation; it is a generalized limit, 
or limit in the mean with respect to the variable $\log n$, of the sequence of 
the $B_n$ obtained by the first of the processes indicated. 


a priori expected value $E\{B_n\}$, and that it is by taking into account both the 
choice of the curve and that of the $t_\nu$ that we say that $B_n - E\{B_n\}$ tends 
almost surely to zero. 

22 It was therefore by mistake that, in my Note of December 12, 1938, I gave 
the curves $\Gamma_2$ and $\Gamma_3$ (denoted in that Note by $C_2$ and $C_3$) as models 
of Brownian motion. At least it seems to be a mistake. There is at first sight 
no reason to think that the periodic function $P_1(\log \tau)$ reduces to a 
constant; but I have not proved that this hypothesis is excluded. Note moreover 
that it is a priori possible that it is constant for $\Gamma_2$ and variable for 
$\Gamma_3$, or conversely. 

Analogous considerations, in the case of the curve $\Gamma_1$, can be applied 
to the area lying between the curve and its chord; one can define a 
generalized stochastic area, which would necessarily be equal to half the area 
of the initial triangle.23 In the case of the curve $\Gamma_2$, there is almost 
surely a generalized stochastic area, but one that varies with the curve 
(whereas the generalized Brownian oscillation does not depend on the choice of 
the curve). 


23 It should be noted that we have not excluded the hypothesis that there is a 
non-generalized stochastic area. If that were the case, it would not prevent 
the area lying between $L_n$ and $A(0)A(1)$, for suitably chosen inscribed 
polygonal lines $L_n$, from failing to converge to this stochastic area, and 
nothing prevents us from thinking that the lines $L'_n$ are precisely such 
exceptional lines. Let us only say, repeating an idea expressed in the 
preceding note, that the mode of definition of the curve implies periodicity 
on the logarithmic scale, and that we see no 


4°. Let us now study the curves obtained by taking for 
$A(0)A(\tfrac12)A(1)$ an isosceles triangle with base $A(0)A(1)$ and apex angle 
$\alpha$; we shall denote them by $\Gamma(\alpha)$; for $\alpha = \pi/2$ they reduce 
to those we have just studied. We shall denote by $\Gamma_h(\alpha)$ 
$(h = 0, 1, 2, 3)$ the curve for which the orientation of the triangles of the 
areas $S'_n$ is defined as for the curve $\Gamma_h$. 

The ratio of similarity (genuine or stochastic) of each of the arcs 
$A(0)A(\tfrac12)$ and $A(\tfrac12)A(1)$ to the whole curve is 

$$q = \frac{1}{2\sin(\alpha/2)}.$$

For $\alpha > \pi/2$, we have $q^2 < \tfrac12$; it follows immediately that, if 
$\tau$ is very small, the length of the chords $A(t)A(t+\tau)$ is 
$o(\sqrt{\tau})$; the Brownian oscillation is zero. The curve then has zero 
superficial measure. Moreover, the area lying between the curve and its chord 
is well defined, in the sense of ordinary analysis. It is unnecessary to dwell 
further on this simple case; the case where $\alpha < \pi/2$, hence 
$q^2 > \tfrac12$, is less simple. It is of course necessary, in order that the 
sequence of the successively defined lines $L'_n$ converge to a curve, that 
$q < 1$. We shall therefore now suppose $\alpha$ to lie between $\pi/2$ and 
$\pi/3$, and shall study the curve $\Gamma_3(\alpha)$ for which the orientation of 
each of the triangles of each area $S'_n$ depends on a draw independent of the 
others. 

The area of a triangle of $S'_n$ is $\pm q^{2n}s$ (s being the area of the 
initial triangle). The total area of $S'_n$, taking the signs into account, has 
root-mean-square value $2^{n/2}q^{2n}s$. The condition for the series 
$\sum S'_n$ which defines the area S to be convergent in quadratic mean, and 
consequently almost surely convergent, is therefore $2q^4 < 1$, that is 
$\alpha > \alpha'$, $\alpha'$ being the angle lying between $\pi/3$ and $\pi/2$ for 
which $8\sin^4(\alpha'/2) = 1$. For these values of $\alpha$, the area S is 
stochastically defined; for $\alpha < \alpha'$, the series $\sum S'_n$ is 
essentially divergent, and one cannot define S, even by processes of averaging. 

From the point of view of the superficial measure of the curve 
$\Gamma_3(\alpha)$, the considerations set out for $\Gamma_3$ remain valid in the 
sense that, if n is large, the portions of the plane covered by $S'_n$ are 
likely to be covered a great many times. But at the same time the sum of the 
areas of the triangles of $S'_n$, taken in absolute value, increases 
proportionally to $(2q^2)^n$; the curve therefore makes all the more infinitely 
small detours, and has all the more chance of being able to fill an area, the 
smaller $\alpha$ is. We are thus in the presence of two causes acting in opposite 
directions, and it is not clear at first sight which one prevails. One can 
only observe that one of the causes varies with 


reason to think that the oscillations which this periodicity leads one to 
expect do not actually exist in the study of the area. The calculation of an 
average over a sufficiently extended interval makes them disappear in any case. 




$\alpha$, and this does not seem to be the case for the other. Moreover, the 
probability that the superficial measure of $\Gamma_3(\alpha)$ is positive, not 
being modified by the result of a finite number of trials, can only be zero or 
one. There is therefore reason to think that there exists a number $\alpha''$ 
(perhaps equal to $\alpha'$) such that this probability is zero for 
$\alpha > \alpha''$ (or perhaps $\alpha \geq \alpha''$) and equal to unity in the 
contrary case. 
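
The thresholds on the apex angle discussed in 4° can be made concrete with a few lines of Python. This is a sketch under the formulas quoted in the text ($q = 1/(2\sin(\alpha/2))$, root-mean-square area $2^{n/2}q^{2n}s$, and $8\sin^4(\alpha'/2) = 1$); the function names are hypothetical and angles are in radians.

```python
import math

def similarity_ratio(alpha):
    """q = 1 / (2 sin(alpha / 2)) for an isosceles triangle of apex angle alpha."""
    return 1.0 / (2.0 * math.sin(alpha / 2.0))

def rms_area(n, alpha, s=1.0):
    """Root-mean-square of S'_n with independent signs: 2**(n/2) * q**(2n) * s."""
    q = similarity_ratio(alpha)
    return 2 ** (n / 2) * q ** (2 * n) * s

def critical_angle():
    """Angle alpha' with 8 sin^4(alpha'/2) = 1, equivalently 2 q^4 = 1."""
    return 2.0 * math.asin(8.0 ** -0.25)

if __name__ == "__main__":
    a = critical_angle()
    print("alpha' in degrees:", math.degrees(a))
    print("2 q^4 at alpha':", 2 * similarity_ratio(a) ** 4)
```

The critical angle comes out at about 73 degrees, between the limiting values 60 and 90 degrees. Above it the terms `rms_area(n, alpha)` decrease geometrically in n; below it they grow.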


5°. Let us now study an example of a scheme in which the ratio of 
stochastic similarity of each of the arcs $A(0)A(\tfrac12)$ and $A(\tfrac12)A(1)$ 
is random. For this purpose we take for $A(\tfrac12)$ a point chosen at random 
on the circle with diameter $A(0)A(1)$, or on one or the other of the 
semicircles bounded by this diameter; likewise each triangle of each area 
$S'_n$ will be a right triangle having as hypotenuse the side of $L'_n$ that 
serves as its base. To bring out the orientation of these triangles, we shall 
suppose in all cases that the undetermined vertex is chosen on the semicircle 
situated to the left of this diameter (the vertices of $L'_n$ being traversed 
in the direction of increasing t), and always with a uniform distribution of 
probability on this semicircle (other rules could moreover be adopted). The 
point chosen will be kept, or else replaced by the symmetric point situated on 
the other semicircle, according to the sign of a number $\varepsilon_n^{(h)}$ 
which will be determined as for the curves $\Gamma$. We shall denote by 
$\Gamma'$ the curves thus obtained, and by $\Gamma'_0, \Gamma'_1, \Gamma'_2, \Gamma'_3$ 
the curves, all random, which correspond respectively to the curves 
$\Gamma_0, \Gamma_1, \Gamma_2, \Gamma_3$. 

Note that, in all cases, the sum $B_n = \sum(\Delta l)^2$ extended over the 
lines $L'_n$ is equal to the square of the length $A(0)A(1)$, a square which we 
always suppose equal to 2. From the point of view of the Brownian oscillation, 
we can repeat what was said for the curves $\Gamma$. If points of division are 
chosen at random, $B_n$ is indeterminate between two positive limits; but the 
fact that, for the lines $L'_n$, we have $B_n = 2$ suffices to show that the 
curve makes just enough infinitely small detours to be able to fill an area, 
if its course were guided by laws other than those of chance. 

Let us first evaluate the area $S'_n$. A triangle of this area, if its 
hypotenuse is $\Delta l$, has area $\tfrac14(\Delta l)^2\sin\phi$, $\phi$ being an 
angle chosen at random between 0 and $\pi$; its expected value, without regard 
to signs, is therefore $(\Delta l)^2/2\pi$, and, for the set of triangles of 
$S'_n$, since $B_n = 2$, the sum of these expected values is $1/\pi$. This is 
moreover a sum of terms that are all very small; one easily verifies that 
there is convergence in quadratic mean, and even almost sure convergence, to 
this expected value. 
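
The two facts used here, that $B_n$ stays equal to 2 and that a triangle with hypotenuse $\Delta l$ has area $\tfrac14(\Delta l)^2\sin\phi$, can be illustrated with a short sketch. It is not the author's procedure in detail: points are complex numbers, the angle $\phi$ is taken as the central angle on the circle of diameter $\Delta l$, the side is drawn independently for each triangle (as for $\Gamma'_3$), and every function name is hypothetical.

```python
import cmath
import math
import random

def apex_on_semicircle(a, b, phi, side):
    """Point of the circle with diameter ab at central angle phi from b.

    side = +1 gives the semicircle to the left of a -> b, -1 the other one
    (an assumed convention).
    """
    centre, radius = (a + b) / 2, (b - a) / 2
    return centre + radius * cmath.exp(side * 1j * phi)

def triangle_area(a, p, b):
    """Unsigned area of the triangle a, p, b."""
    return abs(((p - a).conjugate() * (b - a)).imag) / 2

def refine_random(vertices, rng):
    """Insert on every side a right-angle vertex drawn uniformly in phi."""
    out = [vertices[0]]
    for a, b in zip(vertices, vertices[1:]):
        phi = rng.uniform(0.0, math.pi)
        side = rng.choice((-1, 1))          # independent sign per triangle
        out.extend([apex_on_semicircle(a, b, phi, side), b])
    return out

def sum_of_squares(vertices):
    """B_n: sum of the squared side lengths."""
    return sum(abs(b - a) ** 2 for a, b in zip(vertices, vertices[1:]))

if __name__ == "__main__":
    rng = random.Random(3)
    line = [0j, complex(2 ** 0.5)]          # chord with squared length 2
    for level in range(10):
        line = refine_random(line, rng)
        print(level + 1, sum_of_squares(line))
```

Because every inserted vertex lies on the circle having the side as diameter, each refinement preserves the sum of squares by the Pythagorean theorem. This is why the printed values remain at 2 up to rounding.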

For the curves $\Gamma'_0$, $\Gamma'_1$ and $\Gamma'_2$, we are then in the same 
conditions as for $\Gamma_0$, $\Gamma_1$ and $\Gamma_2$: each area $S'_n$ being a sum 
of triangles having the same orientation, the series $\sum S'_n$ which defines 
S is asymptotically of the form $\sum \pm 1/\pi$; it is divergent, and the area 
S is not stochastically defined; at least, at first sight, it does not seem to 
be. 

For the curve $\Gamma'_3$, on the contrary, the areas of the triangles of each 
$S'_n$ have variable signs. The expected value of $S'_n$ is zero, and that of 
$(S'_n)^2$ is 

$$E\{(S'_n)^2\} = \sum E\left\{\tfrac{1}{16}(\Delta l)^4\sin^2\phi\right\} = \tfrac{1}{32}\sum E\{(\Delta l)^4\} \leq \tfrac{1}{16}\,E\{\operatorname{Max}(\Delta l)^2\}.$$

This expression is the general term of a convergent series. Moreover, when all 
the vertices of the line $L'_n$ (and consequently $S'_1, S'_2, \ldots, S'_{n-1}$) 
are known, the area $S'_n$ depends on a symmetric law. It is known that these 
conditions entail both convergence in quadratic mean and almost sure 
convergence of the series $S = \sum S'_n$; the area S is thus almost surely 
defined. 


6°. Let us now prove that: the curves $\Gamma'$ almost surely have zero 
superficial measure. The reasoning that follows, largely identical to those 
given for Brownian motion and for the curve $\Gamma_3$, applies equally to the 
different types of curves $\Gamma'$. 

Since the curves $\Gamma'$ lie in a bounded region, their superficial measure 
is bounded; it therefore has an expected value m, positive or zero, but finite. 
If $A(0)$, $A(\tfrac12)$ and $A(1)$ are known, the expected values of the 
superficial measures of the arcs $A(0)A(\tfrac12)$ and $A(\tfrac12)A(1)$ are 
$m\cos^2\alpha$ and $m\sin^2\alpha$, $\alpha$ denoting the angle 
$A(1)A(0)A(\tfrac12)$; their a priori expected values, $A(\tfrac12)$ being 
unknown, are therefore equal to $m/2$ (note that, even if an absolutely 
arbitrary probability law were adopted for $\alpha$, the sum of these expected 
values is always m). From this one deduces, exactly as in the case of Brownian 
motion (§6, 2°), that the set of points common to the two arcs considered 
almost surely has zero superficial measure; the same is true a fortiori, 
whatever $\tau$ may be between zero and $\tfrac12$, of the set of points common 
to the arcs $A(\tfrac12-\tau)A(\tfrac12)$ and $A(\tfrac12)A(\tfrac12+\tau)$, which 
is only a part of the preceding one. 

We shall take $\tau = 1/64$ (for the curve $\Gamma'_3$ one could take 1/16). 
It is easily seen that, if $A(0)$, $A(\tfrac12)$ and $A(1)$ are known, the 
possible positions for each of the points $A(\tfrac12-\tau)$ and 
$A(\tfrac12+\tau)$ cover an area completely surrounding the point $A(\tfrac12)$, 
and this with a probability density having a positive lower bound in the 
neighborhood of this point. 

Let $C_1$ denote any one of the possible forms of the arc $A(\tfrac12)A(1)$. 
The different possible arcs $A(\tfrac12)A(\tfrac12+\tau)$ are obtained by choosing 
a possible position for $A(\tfrac12+\tau)$ and an arc $C_1$; the arc 
$A(\tfrac12)A(\tfrac12+\tau)$ will be similar to $C_1$, going from $A(\tfrac12)$ to 
$A(\tfrac12+\tau)$. Taking the origin at the point $A(\tfrac12)$, and representing 
the points of the plane by their affixes $X + iY$, we shall denote by $U(t)$ 
$(\tfrac12 \leq t \leq 1)$ the affix of the point $A(t)$ of the arc $C_1$, and by 
$V(t)$ that of the point $A(t)$ of the arc $A(\tfrac12)A(\tfrac12+\tau)$ similar 
to $C_1$ and ending at a point $A(\tfrac12+\tau)$ of affix $V_1$. We evidently have 

$$\frac{V_u}{V_1} = \frac{U[(1+u)/2]}{U(1)}, \qquad V_u = V(\tfrac12 + u\tau) \quad (0 < u < 1).$$

Suppose that $C_1$ has a positive superficial measure; if $V_u = V$ is given and 
sufficiently small, and $V_1$ is determined by this relation, the points of 
affix $V_1$ describe, as u varies, a curve which is a transform of $C_1$, which 
also has positive superficial measure, and which is moreover very close to 
$A(\tfrac12)$; there will therefore be a positive probability that 
$A(\tfrac12+\tau)$ lies on this curve, and consequently that the arc 
$A(\tfrac12)A(\tfrac12+\tau)$ contains the point M of affix V; this is an 
arbitrary point in the neighborhood of $A(\tfrac12)$. 

If then there were a positive probability that $\Gamma'$ has a positive 
superficial measure, the same would be true of $C_1$; since the circumstance we 
have just examined would occur with positive probability, the probability 
$\phi_1(M)$ that, $A(\tfrac12)$ being known, M belongs to the arc 
$A(\tfrac12)A(\tfrac12+\tau)$ would be positive for M sufficiently close to 
$A(\tfrac12)$; the same would be true of the probability $\phi_0(M)$ relating to 
the arc $A(\tfrac12-\tau)A(\tfrac12)$. Now, once $A(\tfrac12)$ has been chosen, 
these arcs are independent, and the probability that M belongs to both of 
these arcs would have the value $\phi(M) = \phi_0(M)\phi_1(M)$, positive in the 
neighborhood of $A(\tfrac12)$. Its integral over the whole plane, which is the 
expected superficial measure of the set common to these two arcs when 
$A(\tfrac12)$ is known, would be positive. Since this conclusion holds whatever 
the point $A(\tfrac12)$ may be, unless it occupied one of the extreme positions 
$A(0)$ and $A(1)$, which is infinitely improbable, the a priori expected 
measure of this set would be positive, which contradicts the result obtained 
above. The result stated is therefore established. 

We have thus verified once more a fact which is evidently very general:
when the Brownian oscillation is not infinite, the curve studied can cover an
area only through an organization of its infinitely small detours which chance
has no possibility of producing. The general case is that in which the area of
the curve considered is zero.

In the three cases studied in this work (the curves $C$, $\Gamma_1$ and $\Gamma$) we
have used the independence, at least when certain random elements are known, of
an arc preceding a given point and an arc following that point. It is evident
that a relation as precise as stochastic independence is not necessary. Thus the
probabilities $\varphi_0(M)$ and $\varphi_1(M)$ need not be independent; if $\varphi_1(M)$,
although depending on the arc $A(0)A(\tfrac12)$ supposed known, had a positive lower
bound, the conclusion would still hold.

There is therefore reason to think that the general principle just indicated
can be applied to many schemes other than those studied in this work.


Paris. 




A GALOIS THEORY OF LINEAR SYSTEMS OVER COMMUTATIVE
FIELDS.* ¹


By REINHOLD BAER.


N. Jacobson² has recently succeeded in extending the Galois Theory from
commutative fields to non-commutative fields. In accordance with the now- 
adays customary point of view he considers the Galois Theory as the theory 
of finite groups of automorphisms of commutative fields and of their fields of 
invariants. This theory contains the classical correspondence theorem of 
Galois as a simple special case. His fundamental condition which makes it 
possible to carry the commutative theory over to the non-commutative case 
is the restriction to finite groups of automorphisms without inner auto-
morphisms ≠ 1. His method consists in the application of the theory of
simple rings without making much use of the commutative Galois Theory. 

In this paper we give a different approach to Jacobson’s theory. Our 
intent is to use the commutative Galois Theory ruthlessly and it turns out 
that beyond doing this one needs hardly more than the fact that a non-singular 
matrix with coefficients in a commutative field possesses an inverse; in par- 
ticular we do not need any deeper facts concerning linear transformations 
or ideals. 

Our method makes it possible to extend the theory in several directions. 
First we may investigate instead of non-commutative fields linear systems over 
commutative fields which need not have a finite basis over this field of reference 
and with one exception all the theorems of Galois Theory proper hold true in 
this framework. The exception which does not hold true may be easily derived 
in the case of fields from certain theorems concerning linear systems and is 
actually wrong for linear systems. Secondly we can prove that—at least for 
infinite linear systems—Jacobson’s condition, properly phrased, is necessary 
for the validity of a Galois Theory. The rephrasing consists in substituting 
the concept “ central-automorphism ” for “ inner automorphism”; and this is 
necessary, since only the former can be defined in the case of linear systems. 
This is the reason why we need the stronger hypothesis to obtain even those 
results which Jacobson is able to establish on the basis of his weaker con- 
dition. Thirdly we may prove quite general theorems which permit the 


* Received August 4, 1939.
¹ Presented to the American Mathematical Society, September 1939.
² N. Jacobson, Annals of Mathematics, vol. 41 (1940), pp. 1-7.


transfer of the Galois Theory of any class of groups of automorphisms of 
commutative fields to linear systems whenever there exists a commutative 
Galois Theory so that in particular the Galois Theory of infinite algebraic 
extensions may be extended to linear systems; and these general “ transfer ” 
theorems are actually the starting point of our investigations. 

Finally we give the elements of a theory of crossed products of non- 
commutative fields with finite groups of automorphisms. The standard 
theorems may easily be carried over. Only when proving a generalization of 
E. Noether’s “Hauptgeschlechtssatz im minimalen” do we have to assume
that the field in question be finite over its central so that we may use the 
theorem that central-automorphisms are inner automorphisms, a theorem that 
otherwise has no place in our theory. In this chapter as in the others 
Jacobson’s work and ours overlap in many respects though the methods em- 
ployed are rather different—his being strictly non-commutative, ours strictly 
commutative—and though neither obtains all the results of the other one. 


CHAPTER I. Fundamentals and transfer. 


1. If the set $L$ is a commutative group with regard to an operation
which is written as addition, if $L$ contains elements different from 0, if $F$
is a commutative³ field, and if there exists to every element $f$ in $F$, $x$ in $L$
a uniquely determined element $fx = xf$ in $L$ so that


(a) $f(x+y) = fx + fy$ for $f$ in $F$, $x, y$ in $L$,
(b) $(f+g)x = fx + gx$ for $f, g$ in $F$, $x$ in $L$,
(c) $f(gx) = (fg)x$ for $f, g$ in $F$, $x$ in $L$,
(d) $1x = x$ for $x$ in $L$ and 1 the unit element in $F$,


then L is called a linear system over the field F. 

Note that (d) assures the absence of zero-divisors. 
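
The axioms (a) to (d) may be checked on a small instance. The following Python sketch assumes, purely for illustration, that F is the field of rational numbers and that L consists of pairs of rationals; the helper names add, scal, check_axioms and recover are hypothetical and do not occur in the text. The last function illustrates the remark on zero-divisors: for f different from 0 one recovers x from fx, since x = 1x = (f^{-1}f)x = f^{-1}(fx).

```python
from fractions import Fraction
from itertools import product

# Illustrative assumption: F = Q (rationals), L = Q^2 with componentwise
# addition, and fx scales both coordinates of x by f.
ZERO = (Fraction(0), Fraction(0))

def add(x, y):
    """Addition in L."""
    return (x[0] + y[0], x[1] + y[1])

def scal(f, x):
    """The product fx of f in F with x in L."""
    return (f * x[0], f * x[1])

def check_axioms(scalars, vectors):
    """Return True if (a)-(d) hold for every combination in the samples."""
    for f, g in product(scalars, repeat=2):
        for x, y in product(vectors, repeat=2):
            if scal(f, add(x, y)) != add(scal(f, x), scal(f, y)):  # (a)
                return False
            if scal(f + g, x) != add(scal(f, x), scal(g, x)):      # (b)
                return False
            if scal(f, scal(g, x)) != scal(f * g, x):              # (c)
                return False
    return all(scal(Fraction(1), x) == x for x in vectors)         # (d)

def recover(f, fx):
    """For f != 0, recover x from fx, using (c) and (d)."""
    return scal(1 / f, fx)
```

In particular recover(f, ZERO) returns ZERO for every f different from 0, so that fx = 0 with f ≠ 0 forces x = 0.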

Dependence and independence of subsets of L with regard to F may be
defined as usual. Every independent set is contained in a greatest independent 
set, every greatest independent set is a basis and any two greatest independent 
sets contain the same number of elements. These remarks clearly concern L 
as an abelian operator group with operators in F. However it is not this aspect 
of the matter that interests us primarily. 


³ It may be noted that in some parts of Chapter I it is not necessary to assume
that F be a commutative field, that the property of being a field will be sufficient for 
these considerations. 




If $S$ is any subset of $L$, then we denote by $Z(S)$ the set of all the elements
$z$ in $F$ so that $zs$ is in $S$ for every $s$ in $S$. The subset $S$ of $L$ is said to be
complete in $L$, if
(i) $S$ is a subgroup of the additive group $L_+$,
(ii) $S \neq 0$,
(iii) $Z(S)$ is a subfield of $F$.

Most of the complete sets which we shall have to consider will satisfy an 
additional property : 
(iv) If $f$ is an element in $F$ so that there exists an element $s \neq 0$ in $S$
satisfying: $fs$ is an element in $S$, then $f$ is an element in $Z(S)$.


If $S$ is a subset of $L$, $U$ a subset of $F$, then $US$ is the subgroup of the
additive group $L_+$ which is generated by all the elements $us$ for $u$ in $U$ and
$s$ in $S$.

If $S$ is complete in $L$, then the subset $V$ of $F$ is said to be independent
over $S$, if $\sum v_i s_i = 0$ for $v_i$ in $V$ and $s_i$ in $S$ implies that all the $s_i$ are 0.

(1.1) If $S$ is complete in $L$, and if the subset $V$ of $F$ is independent over $S$,
then $V$ is independent over $Z(S)$.

Proof. Suppose that $\sum_{i=1}^{n} z_i v_i = 0$ for $v_i$ in $V$ and $z_i$ in $Z(S)$. There
exists in $S$ an element $s \neq 0$. Hence all the elements $s_i = z_i s$ are in $S$. Thus
we have

$$0 = \sum_{i=1}^{n} z_i v_i s = \sum_{i=1}^{n} v_i s_i.$$

Since $V$ is independent over $S$, this implies that $0 = s_i = z_i s$. Hence $z_i = 0$,
since $s \neq 0$.
The converse of (1.1) is in general not true. Hence we define: 


The subset T of L is the direct product U × S of the subfield U of F
by the subset S of L which is complete in L, if


(1) Z(S) ≤ U,


(2) subsets of U are independent over Z(S) if, and only if, they are
independent over S,
(3) T = US.


It is a consequence of (1.1) that the conditions (2) and (3) may be
condensed into the following condition:

(0) A subset $V$ of $U$ is a basis of $U$ over $Z(S)$ if, and only if, it is a basis
of $T$ over $S$, i.e. every element $t$ in $T$ may be represented in one and only
one way in the form:

$$t = \sum_{v \in V} s(v)\,v \quad \text{for } s(v) \text{ in } S,$$

where all the $s(v)$, apart from a finite number of exceptions, are 0.


That $T$ is generated in “adjoining” $V$ to $S$ is equivalent to (3); and
the unicity of representation of elements in $T$ is equivalent to (2).

A simple method for constructing subsets T of L so that L is the direct
product of F and T is contained in the following statement. 


THEOREM 1.2. Suppose that L is a linear system over the commutative
field F, and that the subset T of L is complete in L. Then L is the direct
product of F and T if, and only if, every basis of the operator group T over
Z(T) is a basis of the operator group L over F.


Proof. Suppose first that $L$ is the direct product of $F$ and $T$. Let $B$ be
some basis of the operator group $T$ over $Z(T)$ so that $T$ may be written:
$T = \sum_{b \in B} Z(T)b$. Suppose furthermore that $f_1, \ldots, f_k$ are elements in $F$,
$b_1, \ldots, b_k$ elements in $B$ so that $\sum_{i=1}^{k} f_i b_i = 0$. Let $A$ be some (linear) basis
of $F$ over $Z(T)$. Then there exist elements $u_j$ in $A$, $z_{ij}$ in $Z(T)$ so that
$f_i = \sum_{j=1}^{m} z_{ij} u_j$ and thus we find that

$$0 = \sum_{i,j} z_{ij} u_j b_i = \sum_{j=1}^{m} u_j t_j, \quad \text{where } t_j = \sum_{i=1}^{k} z_{ij} b_i$$

is an element in $T$, since the $b_i$ are in $T$ and the $z_{ij}$ are in $Z(T)$. Since $L$ is
the direct product of $F$ and $T$, and since the $u_j$ are elements in $F$ which are
independent over $Z(T)$, it follows that $0 = t_j = \sum_{i=1}^{k} z_{ij} b_i$. Since the elements
$b_i$ are part of a basis of the operator group $T$ over $Z(T)$, they are independent
over $Z(T)$ so that all the elements $z_{ij}$ are 0, since they are in $Z(T)$. Thus all
the $f_i$ are 0, i.e. $B$ is independent over $F$ too. Since $L = FT$, every element
in $L$ depends on $B$ (with coefficients in $F$) so that $B$ is a basis of the operator
group $L$ over $F$.

Suppose now conversely that $B$ is some basis of the operator group $T$ over
$Z(T)$ which is at the same time a basis of the operator group $L$ over $F$. Then
clearly $L = FT$. Suppose now that the elements $f_1, \ldots, f_k$ in $F$ are (linearly)
independent over $Z(T)$, and that the elements $t_i$ in $T$ satisfy: $0 = \sum_{i=1}^{k} t_i f_i$.
Since $B$ is a basis of $T$, there exist elements $z_{ij}$ in $Z(T)$, $b_j$ in $B$ so that
$t_i = \sum_{j=1}^{m} z_{ij} b_j$ and hence

$$0 = \sum_{i=1}^{k} \sum_{j=1}^{m} z_{ij} b_j f_i = \sum_{j=1}^{m} \Big[ \sum_{i=1}^{k} z_{ij} f_i \Big] b_j.$$

Since the elements $b_j$ are part of a basis of $L$ over $F$, it follows that
$0 = \sum_{i=1}^{k} z_{ij} f_i$, as these are elements in $F$; and since the $f_i$ are independent
elements in $F$ over $Z(T)$, it follows that the elements $z_{ij}$ in $Z(T)$ are 0. Thus
all the $t_i$ are 0; and this implies that every set in $F$ which is independent over
$Z(T)$ is at the same time independent over $T$. Hence $L$ is the direct product of
$F$ and $T$; and this completes the proof. As a matter of fact we have proved
slightly more, namely the


COROLLARY 1.3. If L is a linear system over the commutative field F,
and if the subset T of L is complete in L, then the following propositions are
equivalent.


(A) L is the direct product of F and T.


(B) There exists at least one basis of F over Z(T) which is a basis of L
over T.

(C) Every basis of the operator group T over Z(T) is a basis of the operator
group L over F.

(D) There exists at least one basis of the operator group T over Z(T) which
is a basis of the operator group L over F.


2. In this section we shall introduce the concept of automorphism of 
a linear system which, of course, will differ from the concept of automorphism 
of an operator group. 


(2.1) If $L$ is a linear system over the (commutative) field $F$, then there exists
to a given automorphism $g$ of the additive group $L_+$ at most one automorphism
$h$ of the field $F$ so that

$$(fx)^g = f^h x^g \quad \text{for } f \text{ in } F \text{ and } x \text{ in } L.$$

Proof. Suppose that $h$ and $k$ are two automorphisms of the field $F$ so that
$f^h x^g = (fx)^g = f^k x^g$ for $f$ in $F$ and $x$ in $L$. There exists in $L$ an element
$w \neq 0$; and $w^g \neq 0$, since $g$ is an automorphism of $L_+$. Hence $f^h w^g = f^k w^g$
or $(f^h - f^k) w^g = 0$ and this implies $f^h = f^k$ for every $f$ in $F$ or $h = k$.


Consequently we define: The transformation g of L is an automorphism⁴
of the linear system L over the field F, if 


⁴ These automorphisms of linear systems over fields are often termed semi-linear
transformations; cp. e.g. N. Jacobson, Annals of Mathematics, vol. 38 (1937), pp. 484-507.




(*) $g$ is an automorphism of the additive group $L_+$, and if


(**) there exists one (and only one) automorphism $h$ of the field $F$ so that
$(fx)^g = f^h x^g$ for $f$ in $F$ and $x$ in $L$.
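
A concrete instance of conditions (*) and (**) may be helpful. The sketch below assumes, for illustration only, that F is the field Q(i) of Gaussian rationals (stored as pairs of rationals) and that L consists of pairs of elements of F; the transformation g conjugates every coordinate, and the induced automorphism h = g' is complex conjugation. All function names are hypothetical and are not taken from the text.

```python
from fractions import Fraction as Q

# Illustrative assumption: F = Q(i), an element a + bi stored as (a, b),
# and L = F^2 with coordinatewise addition and scalar product.
def fmul(a, b):
    """Product of two elements of F."""
    return (a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0])

def conj(a):
    """The field automorphism h of F: complex conjugation."""
    return (a[0], -a[1])

def scal(f, x):
    """The product fx for f in F and x = (x1, x2) in L."""
    return (fmul(f, x[0]), fmul(f, x[1]))

def g(x):
    """Additive automorphism of L: conjugate every coordinate."""
    return (conj(x[0]), conj(x[1]))

def is_semilinear_on(f, x):
    """Check (fx)^g = f^h x^g for one pair f, x, with h = conj."""
    return g(scal(f, x)) == scal(conj(f), g(x))

def is_linear_on(f, x):
    """Check (fx)^g = f x^g, i.e. whether g' would fix f."""
    return g(scal(f, x)) == scal(f, g(x))
```

Here is_semilinear_on holds for every pair, whereas is_linear_on fails for f = i and a suitable x, so that g is an automorphism of the linear system which is not an automorphism of the operator group.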


If g is an automorphism of the linear system L over the field F, then we
say of the uniquely determined automorphism h of the field F which occurs
in (**) that it is induced by g and put h = g'. If G is some set of
automorphisms of L, then we denote by G' the set of all the g' for g in G.

If u, v are both automorphisms of L, then (uv)' = u'v'. If G is a group
of automorphisms of L, then a homomorphism of G upon the group G' is
defined in mapping the element g in G upon the element g' in G'; and this
homomorphism of G upon G' is an isomorphism between G and G' if, and
only if, g' = 1 implies g = 1 (for elements g in G).

An automorphism which leaves all the elements in some set Y invariant
is called a Y-automorphism. If the subset S of the linear system L over the
field F is complete in L, and if g is an S-automorphism of L, then g' is a
Z(S)-automorphism of F.

If $G$ is a group of automorphisms of $L$, then $(L,G)$ consists of all the
elements in $L$ which are left invariant by every automorphism in $G$; and
$(F,H)$ is defined accordingly.

If $S$ is a subset of $L$, then $(S < L)$ is the group of all the $S$-automorphisms
of $L$; and $(S < F)$ is defined accordingly. These operations satisfy (as usual):

$$G \le ((L,G) < L) \quad \text{and} \quad S \le (L,(S < L)).$$

Since furthermore $G \le H$ implies $(L,H) \le (L,G)$, and since $S \le T$ implies
$(T < L) \le (S < L)$, it follows that

$$(S < L) = ((L,(S < L)) < L) \quad \text{and} \quad (L,G) = (L,((L,G) < L)).$$


(2.2) Let G be a group of automorphisms of the linear system L over the 
commutative field F. 

(a) If every (F,G')-automorphism of F is induced by at most one (L,G)-
automorphism of L, then (L,G) ≠ 0.

(b) If (L,G) ≠ 0, then

(b.1) Z((L,G)) = (F,G'),

(b.2) (L,G) is complete in L,

(b.3) the element f in F belongs to Z((L,G)) whenever there exists an
element t ≠ 0 in (L,G) so that ft is in (L,G).




Proof. Assume that every $(F,G')$-automorphism of $F$ is induced by at
most one $(L,G)$-automorphism of $L$ and that $(L,G) = 0$. If $f$ is an element
$\neq 0$ in $F$, then an automorphism $g$ of $L$ is defined by $x^g = fx$ for $x$ in $L$, since
$(rx)^g = frx = r x^g$ for $r$ in $F$. Hence $g$ is an $(L,G)$-automorphism of $L$
satisfying $g' = 1$, and this implies $g = 1$ so that $f = 1$, i.e. $F$ consists of 0
and 1 only. Consequently every $g' = 1$ so that every $g = 1$; and hence
$(L,G) = L \neq 0$ which is a contradiction.

Assume now that $(L,G) \neq 0$, and that $f$ is an element in $F$, $t \neq 0$ an
element in $(L,G) = T$ and that $ft$ is in $T$ too. If $g$ is any automorphism
in $G$, then $ft = (ft)^g = f^{g'} t$ so that $f = f^{g'}$ for every $g'$ in $G'$. Hence $f$ is an
element in $(F,G')$ and in particular $Z(T) \le (F,G')$. If $z$ is an element
in $(F,G')$, $t$ any element in $T$, then $(zt)^g = zt$ for every $g$ in $G$ so that $zt$
is in $T$ and consequently $z$ is in $Z(T)$. Hence $Z(T) = (F,G')$ so that $Z(T)$
is a field and $T$ is complete in $L$.


THEOREM 2.3. Suppose that L is a linear system over the commutative
field F, and that the subset T of L is complete in L.


(a) If L is the direct product of F and T, then every Z(T)-automorphism
of F is induced by one and only one T-automorphism of L.

(b) L = FT if, and only if, the identity is the only T-automorphism of L
which induces the identity in F.

(c) If Z(T) = (F, (T < L)'), then every independent subset of the operator
group T over Z(T) is independent over F too.


Proof. If $L$ is the direct product of $F$ and $T$, and if $B$ is a basis of $F$
over $Z(T)$, then $B$ is a basis of $L$ over $T$. If $x$ is any element in $L$, then
there exist therefore uniquely determined elements $t(x,b)$ in $T$ so that only
a finite number of the $t(x,b)$ are different from 0, and so that
$x = \sum_{b \in B} t(x,b)\,b$. If $g$ and $h$ are two $T$-automorphisms of $L$ so that $g' = h'$, then

$$x^g = \sum_{b \in B} t(x,b)\,b^{g'} = \sum_{b \in B} t(x,b)\,b^{h'} = x^h$$

so that $g = h$. If conversely $u$ is a $Z(T)$-automorphism of $F$, then a
$T$-automorphism $v$ of $L$ is defined by $x^v = \sum_{b \in B} t(x,b)\,b^u$ and clearly $v' = u$.
This proves (a).

$L^* = FT$ is in any case an admissible subgroup of the operator group $L$
over $F$. Thus every basis of the operator group $L^*$ over $F$ is contained in
some basis $B$ of $L$ over $F$. If $L^* < L$, then there exists in $B$ an element $w$
which is not contained in $L^*$. As $T \neq 0$, there exists in $T$ an element $t \neq 0$;
and there exists one and only one automorphism $g$ so that $g' = 1$, $b = b^g$ for
$b \neq w$ in $B$, $w^g = w + t$. Since $g$ is a $T$-automorphism, and since $g \neq 1$,
it follows that $FT \neq L$ implies the existence of a $T$-automorphism $\neq 1$ which
induces the identity in $F$; and this proves (b), since $g$ is the identity on $FT$,
if $g$ is a $T$-automorphism such that $g' = 1$.

Suppose now that $Z(T) = (F, (T < L)')$ and that $S$ is a subset of $T$
which is independent over $Z(T)$. If $S$ would not be independent over $F$,
then $S$ would contain a finite subset which is dependent over $F$, and amongst
these there would be a smallest one, say $s_1, \ldots, s_k$. There exist therefore
elements $f_i$, not all of them 0, so that $0 = \sum_{i=1}^{k} s_i f_i$ and since the $s_i$ form a
smallest dependent subset of $S$, none of the $f_i$ is 0. If $g$ is any
$T$-automorphism of $L$, then $0 = \sum_{i=1}^{k} s_i f_i^{g'}$ and consequently

$$0 = \sum_{i=2}^{k} s_i \big( f_i f_1^{g'} - f_i^{g'} f_1 \big)$$

and this implies $f_i f_1^{g'} = f_i^{g'} f_1$, since the $s_i$ form a smallest [over $F$] dependent
subset of $S$. Hence $f_i f_1^{-1}$ is invariant under all the $g'$ in $(T < L)'$ and belongs
therefore to $Z(T)$. Since $f_1 \neq 0$, we find therefore that

$$0 = \sum_{i=1}^{k} s_i f_i f_1^{-1}$$

where all the coefficients are elements $\neq 0$ in $Z(T)$. Hence the $s_i$ and therefore
$S$ would be dependent over $Z(T)$ and this is impossible so that (c) holds
true too.


3. The problem of a Galois Theory of linear systems will be reduced
by means of the theorems in this section to the corresponding problems of
Galois Theory in commutative fields.

THEOREM 3.1. The subset T of the linear system L over the commutative 
field F satisfies 
(a) T = (L,(T < L)) and
(b) 1 is the only automorphism g in (T < L) so that g' = 1,
if, and only if,

(i) T is complete in L,
(ii) Z(T) = (F,(Z(T) < F)),
(iii) L is the direct product of F and T.

Proof. Suppose first that the conditions (a) and (b) are satisfied by T.
Then it follows from (2.2) that T is complete in L (condition (i)!) and
that Z(T) = (F,(T < L)'). Since therefore (T < L)' ≤ (Z(T) < F), it
follows that

Z(T) ≤ (F,(Z(T) < F)) ≤ (F,(T < L)') = Z(T)


and this proves (ii).




Suppose now that B is a basis of the operator group T over the field Z(T).
Then it follows from Theorem 2.3, (c) that B is independent over F so that
B is a basis of the operator group L over F, since it follows from (b) and
Theorem 2.3, (b) that L = FT. Hence L is the direct product of F and T
by Theorem 1.2.

Suppose now conversely that the conditions (i) to (iii) are satisfied by $T$.
Then it is a consequence of Theorem 2.3, (a) that (b) holds true, and that
$(Z(T) < F) = (T < L)'$. Let now $B$ be a basis of the operator group $T$
over $Z(T)$. Then $B$ is by Theorem 1.2 and condition (iii) a basis of the
operator group $L$ over $F$. If $w$ is any element in $(L,(T < L))$, then there
exist elements $f_i$ in $F$ and elements $b_i$ in $B$ so that $w = \sum_{i=1}^{k} f_i b_i$. If $v$ is any
automorphism in $(Z(T) < F)$, then there exists a $T$-automorphism $g$ of $L$
so that $g' = v$; and hence we find that

$$\sum_{i=1}^{k} f_i^{v} b_i = w^g = w = \sum_{i=1}^{k} f_i b_i$$

so that $f_i = f_i^{v}$ for every $v$ in $(Z(T) < F)$. All the elements $f_i$ are therefore in
$(F,(Z(T) < F))$; and since this field is equal to $Z(T)$ by condition (ii),
it follows that all the $f_i$ are in $Z(T)$ and that $w$ is in $T$, since all the $b_i$ are
in $T$. Hence $(L,(T < L)) \le T$, i.e. (a) holds true too.

THEOREM 3.2. The group G of automorphisms of the linear system L 
over the commutative field F satisfies 
(a) G = ((L,G) < L) and
(b) 1 is the only automorphism g in G so that g’ =1 
if, and only if, 

(i) G’=((F,G’) < F) and 
(ii) every (F,G’)-automorphism of F is induced by one and only one 
(L,G)-automorphism of L. 

Proof. Suppose first that (a) and (b) are satisfied by G. Put T = (L,G)
so that G = (T < L) and T = (L,(T < L)) by (a). Hence it follows from
(b) and Theorem 3.1 that T is complete in L, that Z(T) = (F,(Z(T) < F))
and that L is the direct product of F and T. Now it is a consequence of
Theorem 2.3 that every Z(T)-automorphism of F is induced by one and only
one T-automorphism of L and hence (ii) holds true, since

Z(T) = (F,(Z(T) < F)) = (F,(T < L)')

by (2.2). Finally it follows now from (a) that

G' = ((L,G) < L)' = (T < L)' = ((F,G') < F)


so that (i) holds true too.




Assume conversely that G satisfies (i) and (ii). If the automorphism w
is in ((L,G) < L), then w' is in (Z((L,G)) < F); and since Z((L,G))
= (F,G') by (ii) and (2.2), w' is in ((F,G') < F) which group equals G'
by (i). Hence it follows from (ii) that w is in G so that (a) and (b) are
consequences of (i) and (ii).


COROLLARY 3.3. Suppose that the group G of automorphisms of the
linear system L over the commutative field F satisfies the conditions (a) and
(b) of Theorem 3.2, and that H is a subgroup of G. Then H = ((L,H) < L)
if, and only if, H' = ((F,H') < F).

This follows from Theorem 3.2, since H satisfies condition (b) of
Theorem 3.2 as a subgroup of G, and since (F,G') ≤ (F,H') implies that
H satisfies condition (ii) of Theorem 3.2 as G satisfies this condition.


THEOREM 3.4. Suppose that L is a linear system over the field F, that T
is a subset of L, that T = (L,(T < L)), that 1 is the only T-automorphism g
of L with g' = 1, and that B is a set between T and L.


(A) B = (L,(B < L)) if, and only if, there exists a field R between Z(T)
and F so that R = (F,(R < F)) and so that B = RT.

(B) If R is a field between Z(T) and F so that R = (F,(R < F)), then
R = Z(RT) and RT is the direct product of R and T.


Proof. Suppose first that there exists a field $R$ between $Z(T)$ and $F$
so that $R = (F,(R < F))$ and so that $B = RT$. Then $R \le Z(B)$. Let $z$ be
an element in $Z(B)$, $t$ an element $\neq 0$ in $T$ and $v$ an $R$-automorphism of $F$.
Then there exists by Theorem 3.1 and 2.3 a $T$-automorphism $g$ of $L$ so that
$g' = v$. Since $g$ leaves all the elements in $T$ invariant, and since $v$ leaves all
the elements in $R$ invariant, $g$ leaves all the elements in $RT$ invariant. Since
$T \le RT$, and since $z$ is in $Z(RT)$, $zt$ is an element in $RT$ so that
$zt = (zt)^g = z^v t$ and $z = z^v$, since $t \neq 0$. Hence $z$ is in $(F,(R < F)) = R$ so
that $R = Z(RT)$. Hence (B) holds true, since subsets of $R \le F$ that are
independent over $Z(T)$ are independent over $T$.

Let now $S$ be a basis of the operator group $T$ over $Z(T)$. It follows from
Theorem 3.1 and Theorem 1.2 that $S$ is a basis of the operator group $L$ over
$F$. If $w$ is an element in $(L,(RT < L))$, then there exist elements $f_i$ in $F$
and elements $s_i$ in $S$ so that $w = \sum_{i=1}^{k} f_i s_i$. If $v$ is any $R$-automorphism of $F$,
then there exists again a $T$-automorphism $g$ of $L$ so that $g' = v$; and $g$ is an
$RT$-automorphism of $L$. Hence

$$\sum_{i=1}^{k} f_i s_i = w = w^g = \sum_{i=1}^{k} f_i^{v} s_i$$

so that $f_i = f_i^{v}$, since the $s_i$ are in $T$ and are independent over $F$. The elements
$f_i$ are therefore in $(F,(R < F)) = R$ so that $w$ is in $RT$, i.e. $RT = (L,(RT < L))$.

Assume now conversely that the set B between T and L satisfies
B = (L,(B < L)). There exists no B-automorphism of L, inducing 1 in F,
except 1, since there exists no T-automorphism ≠ 1 of L which induces the
identity in F. Hence it follows from Theorem 3.1 that B is complete in L,
that Z(B) = (F,(Z(B) < F)) and that L is the direct product of F and B.
Consequently B* = Z(B)T ≤ B; and it follows from what has been proved
in the first two paragraphs that Z(B*) = Z(B) and that B* = (L,(B* < L)).
It is a consequence of Theorem 3.1 and Theorem 2.3 that every Z(T)-automorphism
of F is induced by one and only one T-automorphism of L so that
(B < L)' = (Z(B) < F) = (Z(B*) < F) = (B* < L)' and therefore
(B < L) = (B* < L) and finally B* = (L,(B* < L)) = (L,(B < L))
= B and this completes the proof of (A).


THEOREM 3.5. Suppose that $T$ is a subset of the linear system $L$ over
the field $F$, that $T = (L,(T < L))$, that the identity is the only $T$-automorphism
of $L$ which induces the identity in $F$, and that the set $B$ between $T$
and $L$ is complete in $L$. Then $B$ satisfies:


(a) $B = B^g$ for every $T$-automorphism $g$ of $L$,


(b) every $T$-automorphism of the linear system $B$ over the field $Z(B)$ is
induced by some automorphism of $L$,


(c) $(F,(B < L)') = Z(B)$
if, and only if, the following conditions are satisfied by $B$:


(i) $(B < L)$ is a normal subgroup of $(T < L)$,


(ii) every $Z(T)$-automorphism of $Z(B)$ is induced by automorphisms of $F$,
(iii) $B = (L,(B < L))$.


Proof. We note first that $(B^g < L) = g^{-1}(B < L)g$ for every automorphism
$g$ of $L$. This shows that (i) is a consequence of (a). If conversely
(i) and (iii) are satisfied, then $B^g \le (L,(B^g < L)) = (L,(B < L)) = B$
for every $T$-automorphism $g$ of $L$. This implies that $B \le B^{g^{-1}}$ for every
$T$-automorphism $g$ of $L$ and therefore we have $B = B^g$ for every $T$-automorphism
$g$ of $L$, i.e. (a) is a consequence of (i) and (iii).

If (iii) is true, then it follows from Theorem 3.4 that Z(B)
= (F,(Z(B) < F)) and that B = Z(B) × T. It follows from Theorem
2.3 that every Z(T)-automorphism of Z(B) is induced by one and only one
T-automorphism of B. If now g is any T-automorphism of B, then g' is a
Z(T)-automorphism of Z(B). If (ii) holds true, then there exists an
automorphism u of F which induces g' in Z(B). There exists by Theorem 2.3
one and only one T-automorphism h of L so that h' = u. It is a consequence
of (a) that h induces an automorphism k in B. Since clearly k' = g', it
follows that k = g, as every Z(T)-automorphism of Z(B) is induced by one
and only one T-automorphism of B. Thus (b) is a consequence of (i) to (iii).

Suppose now that u is a Z(B)-automorphism of F. As u is a Z(T)-
automorphism of F, there exists one and only one T-automorphism v of L
so that v' = u. Since B = Z(B) × T, it follows that v is a B-automorphism
of L, and this shows that (Z(B) < F) = (B < L)'. Since we already
proved that Z(B) = (F,(Z(B) < F)), condition (c) is a consequence of
(i) to (iii).

We assume now that conditions (a) to (c) are satisfied by B. Let S be
any basis of the operator group T over Z(T). Then S is a basis of the
operator group L over F so that S is independent over Z(B). Hence S is
contained in a basis S* of the operator group B over Z(B). But it follows
from (c) and Theorem 2.3, (c) that S* is independent over F too.
Consequently S = S* and B is the direct product of Z(B) and T, as follows
from Theorem 1.2.


Since (B < L)' ≤ (Z(B) < F), and since therefore

Z(B) ≤ (F,(Z(B) < F)) ≤ (F,(B < L)') = Z(B)


by (c) or Z(B) = (F,(Z(B) < F)), it follows now from Theorem 3.4 that
B = (L,(B < L)).

Suppose finally that u is a Z(T)-automorphism of Z(B). Since B is
the direct product of Z(B) and T, there exists by Theorem 2.3 one and only
one T-automorphism v of B so that v' = u. It is a consequence of (b) that
there exists an automorphism g of L which induces v in B. Then the
automorphism g' of F induces v' = u in Z(B). This completes the proof of the
fact that (i) to (iii) are consequences of (a) to (c).




CHAPTER II. Galois Theory. 


4. In this section we state the finite, commutative Galois Theory in the
form most convenient for our purposes.⁵


(4.1) Suppose that K is a subfield of the (commutative) field F. Then
there exists a finite group H of automorphisms of the field F so that


K = (F,H)
if, and only if, F is finite, normal and separable over K.


(4.2) If H is a finite group of automorphisms of the (commutative) field
F, then
H = ((F,H) < F).


(4.3) If F is finite, normal and separable over its subfield K, then 
B= (F, (B < F)) 


for every field B between K and F, i.e. F is finite, normal and separable over 
every field B between K and F. 


(4.4) If F is finite, normal and separable over its subfield K, and if B is a
field between K and F, then a necessary and sufficient condition for B to be
finite, normal and separable over K is that
(B < F) is a normal subgroup of (K < F),
and then (K < B) and (K < F)/(B < F) are essentially the same.


(4.5) If $F$ is finite, normal and separable over its subfield $K$, then there
exists in $F$ an element $b$ so that the elements $b^h$ for $h$ in $(K < F)$ form a
basis of $F$ over $K$. (Existence of a normal basis).⁶

It may finally be mentioned that finite and separable extensions are


⁵ Apart from the text-books on modern algebra one should consult the following
papers in which the theory has been presented in a form similar to the one sketched
here. R. Baer, Mathematische Zeitschrift, vol. 33 (1931), pp. 451-479; R. Baer, American
Journal of Mathematics, vol. 59 (1937), pp. 869-888; W. Krull, Mathematische
Annalen, vol. 100 (1929), pp. 687-698; E. Steinitz, Algebraische Theorie der Körper.
Neu herausgegeben und mit einem Anhang: Abriss der Galoisschen Theorie versehen
von Reinhold Baer und Helmut Hasse. Berlin, 1930.

⁶ A complete proof of this theorem has first been given by M. Deuring,
Mathematische Annalen. All the proofs published so far use extensively the theory of
representations. There exists however an unpublished proof by E. Artin which uses
but elementary means from the theory of fields so that this theorem may now be
considered a part of Galois Theory proper.




simple extensions, and that the degree of a finite, normal and separable
extension is exactly the order of its group, and that the matrix $(b_i^g)$ possesses
an inverse, if the $b_i$ form a basis, the $g$ the group of a finite, normal and
separable extension; finally that every automorphism of a subfield of a normal
extension is induced by an automorphism of the extension field.


5. In this section some remarks, concerning matrices and linear equa- 
tions, shall be given which will prove useful in the future. 

Let $L$ be a linear system over the field $F$. If $B$ is a matrix of $n$ rows
and $n$ columns with coefficients in $F$, and if $X$ is a matrix of $n$ rows and one
column with coefficients in $L$, then

$$BX \;(= (b_{ik})(x_j)) = \Big( \sum_{k=1}^{n} b_{ik} x_k \Big)$$

is a matrix of $n$ rows and one column with coefficients in $L$.

If $A$ and $B$ are two matrices of $n$ rows and $n$ columns, both with coefficients
in $F$, and if $X$ is a matrix of $n$ rows and one column with coefficients
in $L$, then one verifies readily that

$$A(BX) = (AB)X.$$

If in particular $E$ is the unit-matrix in $F$, then $EX = X$.

It is now possible to write the system of linear equations

$$(+) \qquad \sum_{k=1}^{n} b_{ik} x_k = c_i \quad \text{for } i = 1, \ldots, n, \; b_{ik} \text{ in } F, \; c_i \text{ in } L,$$

in the matrix form: $(b_{ik})(x_k) = (c_i)$. The solutions $x_k$ of (+) should be
looked for in $L$. If in particular the matrix $(b_{ik}) = B$ is non-singular, i.e.
if the determinant of $B$ is different from 0, then there exists the inverse
matrix $B^{-1}$ to $B$; and the system (+) of linear equations has one and only
one system of solutions $x_k$ in $L$, since $(b_{ik})(x_k) = (c_i)$ if, and only if,

$$B^{-1}(c_i) = B^{-1}(b_{ik})(x_k) = E(x_k) = (x_k).$$
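
The solution formula above may be traced on a small instance. The following Python sketch assumes, for illustration only, that F is the field of rational numbers, that L consists of pairs of rationals, and that n = 2; the helper names scal, add, inverse2, mat_apply and solve are hypothetical and do not occur in the text.

```python
from fractions import Fraction as Q

# Illustrative assumption: F = Q and L = Q^2 (pairs of rationals), n = 2.
def scal(f, x):
    """The product fx of f in F with x in L."""
    return tuple(f * c for c in x)

def add(x, y):
    """Addition in L."""
    return tuple(a + b for a, b in zip(x, y))

def inverse2(B):
    """Inverse of a non-singular 2x2 matrix with coefficients in F."""
    (a, b), (c, d) = B
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_apply(B, X):
    """The column BX: its i-th entry is b_i1 x_1 + b_i2 x_2, an element of L."""
    return [add(scal(row[0], X[0]), scal(row[1], X[1])) for row in B]

def solve(B, C):
    """The one and only solution X of BX = C, computed as X = B^{-1} C."""
    return mat_apply(inverse2(B), C)
```

For instance, with B = [[1, 2], [3, 4]] and C = [(5, 0), (6, 1)] one may compute X = solve(B, C) and confirm that mat_apply(B, X) returns C; only the invertibility of B in F is used, and nothing further is required of L.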


6. Since the Galois Theory of finite groups of automorphisms is fully 
developed, it is possible to derive stronger theorems in the case of finite groups 
of automorphisms than the theorems of section 3.


THEOREM 6.1. Let T be a subset of the linear system L over the commutative field F. Then there exists a finite group G of automorphisms of
L so that
(1) the identity is the only automorphism in G which induces the identity
in F,

(2) T = (L,G)
if, and only if,




A GALOIS THEORY OF LINEAR SYSTEMS OVER COMMUTATIVE FIELDS. 565 


(i) T is complete in L,
(ii) L is the direct product of F and T,
(iii) F is finite, normal and separable over Z(T).


Proof. Assume first that the finite group G of automorphisms of L
satisfies the conditions (1) and (2), and that T = (L,G). Then G and G’
are isomorphic finite groups. Hence F is finite, normal and separable over
(F,G’) by (4.1).

Now let $b_1, \dots, b_n$ be a basis of F over (F,G’). Then G and G’ both
contain n elements; and there exists an inverse M to the matrix $(b_i^{g'})$ where
the row-index g’ runs over all the elements in G’. If u is any element in L,
then the system

$$(+) \qquad \sum_{i=1}^{n} b_i^{g'} x_i = u^g \quad \text{for } g \text{ in } G$$

of linear equations possesses one and only one system of solutions $x_i$ in L,
namely, in matrix-notation, $(x_i) = M(u^g)$. If h is any automorphism in
G, then $x_i^h$ satisfies

$$\sum_{i=1}^{n} b_i^{g'h'} x_i^h = u^{gh} \quad \text{for } g \text{ in } G.$$

Since gh runs over all the elements in the group G when g takes all the values
in G, it follows that the $x_i^h$ are another solution of (+); and since (+)
possesses but one solution it follows that $x_i = x_i^h$ for every h in G so that
the $x_i$ are actually elements in T = (L,G). This implies in particular that
L = FT. If one applies this result concerning (+) to u = 0, then it follows
that the $b_i$ are independent over T, since $\sum_{i=1}^{n} b_i t_i = 0$ with $t_i$ in T implies that
the equations $\sum_{i=1}^{n} b_i^{g'} t_i = 0$ for g in G are satisfied, and since the only solutions
of these equations are $t_i = 0$.

Since L = FT, T ≠ 0; and it follows from (2.2) that Z(T) = (F,G’),
that T is complete in L, and therefore from the results of the first paragraph
of the proof that L is the direct product of F and T. The conditions (i)
to (iii) are therefore satisfied by T.
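The argument of this first half of the proof may be followed on the smallest non-trivial case. The sketch below is an illustration only and rests on assumptions not made in the text: F is taken to be the field Q(i) of Gaussian rationals, L is F itself, G consists of the identity and complex conjugation, so that T = (L,G) is the field of rational numbers, and the basis of F over (F,G’) is 1, i. The class `QI` and the function `solve` are hypothetical names. The code solves the system (+) for a given u and exhibits that the solutions are invariant under G and that u is their combination with the basis elements.

```python
from fractions import Fraction


class QI:
    """Element a + b*i of Q(i), with exact rational a and b."""

    def __init__(self, a, b=0):
        self.a, self.b = Fraction(a), Fraction(b)

    def __add__(self, o):
        return QI(self.a + o.a, self.b + o.b)

    def __sub__(self, o):
        return QI(self.a - o.a, self.b - o.b)

    def __mul__(self, o):
        return QI(self.a * o.a - self.b * o.b, self.a * o.b + self.b * o.a)

    def __eq__(self, o):
        return (self.a, self.b) == (o.a, o.b)

    def conj(self):
        return QI(self.a, -self.b)

    def inverse(self):
        n = self.a ** 2 + self.b ** 2
        return QI(self.a / n, -self.b / n)


one, i = QI(1), QI(0, 1)
basis = [one, i]                  # assumed basis b_1, b_2 of F over (F,G')
group = [lambda z: z, QI.conj]    # G = {1, conjugation}


def solve(u):
    """Solve sum_k b_k^{g'} x_k = u^g (one equation per g in G) by Cramer's rule."""
    (a, b), (c, d) = [[g(bk) for bk in basis] for g in group]
    rhs = [g(u) for g in group]
    det_inv = (a * d - b * c).inverse()
    x1 = (rhs[0] * d - b * rhs[1]) * det_inv
    x2 = (a * rhs[1] - c * rhs[0]) * det_inv
    return x1, x2


u = QI(3, 4)
x1, x2 = solve(u)
```

In this example the two solutions are the real and the imaginary part of u; both lie in T, and the equation belonging to g = 1 expresses u as an element of FT.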

Assume conversely that the conditions (i) to (iii) are satisfied by T.
Then it follows from (4.3) that Z(T) = (F, (Z(T) < F)) and it follows
therefore from Theorem 3.1 that (T < L) satisfies condition (1) and that
T = (L, (T < L)). Since (T < L) satisfies (1), (T < L) and (T < L)’
are isomorphic groups, so that (T < L) is finite, since F is finite over Z(T),
and since (T < L)’ is a subgroup of (Z(T) < F). Thus the existence of a




finite group G of automorphisms of L, satisfying (1) and (2), is a con- 
sequence of (i) to (iii). 

An alternative proof for this last inference may be given, as this second
proof does not use Theorem 3.1. If (i) to (iii) are satisfied by T, then it
follows from (4.3) that Z(T) = (F, (Z(T) < F)). Let b be an element
in F so that the elements $b^g$ for g in (Z(T) < F) form a basis of F over Z(T)
(cp. (4.5)). The elements $b^g$ form a basis of L over T, by (ii), and it
follows from Theorem 2.3 that (T < L) satisfies (1) and that (Z(T) < F)
= (T < L)’ so that (T < L) = G is finite. If finally x is an element in
(L, (T < L)), then there exist elements t(g) in T so that $x = \sum_{g \in G} t(g)\, b^{g'}$.
Consequently $x = x^h = \sum_{g \in G} t(g)\, b^{g'h'}$ for every h in G; and this implies that all the
t(g) are equal to a fixed element t in T so that $x = t \sum_{g \in G} b^{g'} = tz$ for z in
Z(T). Hence x is in T and consequently T = (L, (T < L)).


COROLLARY 6.2. Suppose that the subset T of the linear system L over
the commutative field F is complete in L, and that F is finite, normal and 


separable over Z(T). 


(a) If L is the direct product of F and T, then (T < L) is a finite group 
of automorphisms of L, the identity is the only T-automorphism of L which 
induces the identity in F and T = (L,(T < L)). 


(b) L is the direct product of F and T if, and only if, every Z(T)-automorphism
of F is induced by one and only one T-automorphism of L.


Proof. (a) has already been verified in the proof of Theorem 6.1. That
the condition of (b) is necessary, follows from Theorem 2.3. If on the other
hand every Z(T)-automorphism of F is induced by one and only one T-automorphism
of L, then (T < L)’ = (Z(T) < F) and therefore Z(T)
= (F, (Z(T) < F)) = (F, (T < L)’) by (4.3). Hence it follows from
Theorem 2.3, (b), (c) that L is the direct product of F and T.


THEOREM 6.3. If the identity is the only automorphism in the finite 
group G of automorphisms of the linear system L over the commutative field 
F which induces the identity in F, then G = ((L,G) < L). 


Proof. It is a consequence of Theorem 6.1 that (L,G) is complete in
L, that F is finite, normal and separable over Z((L,G)) and that L is the
direct product of F and (L,G). Hence it follows from Corollary 6.2, (b)
that every automorphism in (Z((L,G)) < F) is induced by an (L,G)-automorphism
of L so that ((L,G) < L)’ = (Z((L,G)) < F), and it is a




consequence of (2.2) that Z((L,G)) = (F,G’). Now it follows from (4.2)
that G’ = ((F,G’) < F) = (Z((L,G)) < F) = ((L,G) < L)’. Since
G’ is finite, and since every Z((L,G))-automorphism of F is induced by one
and only one (L,G)-automorphism of L, this implies that G = ((L,G) < L).


THEOREM 6.4. Suppose that L is a linear system over the commutative 
field F, that the subset T of L is complete in L, that L is the direct product 
of F and T, that F is finite, normal and separable over Z(T), and that B is 
a set between T and L. 

(A) B = (L, (B < L)) if, and only if, there exists a field R between Z(T)
and F so that B = RT.


(B) If R is a field between Z(T) and F, then R = Z(RT).


This is a consequence of Corollary 6.2 and of Theorem 3.4, since every
field R between Z(T) and F satisfies R = (F, (R < F)) by (4.3).


THEOREM 6.5. Suppose that L, F, T and B satisfy the hypotheses of
Theorem 6.4, and that B is complete in L. Then B satisfies


(a) (B<L) is a normal subgroup of (T < L), and 


(b) B= (L,(B<L)) 
if, and only if, 


(i) $B = B^g$ for every T-automorphism g of L, and
(ii) (F, (B < L)’) = Z(B).


Proof. Every Z(T)-automorphism of Z(B) is induced by some automorphism
of F, since Z(B) is between Z(T) and F, and since F is finite and
normal over Z(T). Thus the above conditions (a) and (b) imply the conditions
(i) to (iii) of Theorem 3.5 and consequently the above conditions
(i) and (ii). If conversely the above conditions (i) and (ii) are satisfied,
then (a) is a consequence of $(B^g < L) = g^{-1}(B < L)g$. Since (T < L) is
finite, (B < L) and (B < L)’ are both finite, and hence it follows from (ii)
and (4.2) that

$$(B < L)' \leq (Z(B) < F) = ((F, (B < L)') < F) = (B < L)'$$

or

$$(B < L)' = (Z(B) < F).$$

By (i) every T-automorphism g of L induces a T-automorphism g* in the
linear system B over Z(B). If g*’ = 1, then g’ is in (Z(B) < F) and therefore
in (B < L)’ so that g is in (B < L), i.e. g* = 1. The group G* of




these automorphisms g* is finite, satisfies condition (1) and (2) of Theorem
6.1, since (B,G*) = (L, (T < L)) = T. Hence it follows from Theorem
6.1 that B = Z(B)T, and it follows from Theorem 6.4 that B = (L, (B < L)).
The following theorem is some sort of a converse to Theorem 6.3.


THEOREM 6.6. Suppose that the linear system L over the commutative
field F contains an infinity of elements, that G is a finite group of automorphisms
of L and that (L,G) ≠ 0. Then G = ((L,G) < L) if, and only
if, the identity is the only automorphism in G which induces the identity in F.


Proof. The sufficiency of the condition is a consequence of Theorem
6.3. Suppose now that the identity is not the only automorphism in G which
induces the identity in F. Then denote by W the set of all the elements w in
G so that w’ = 1. Clearly W is a normal subgroup of G. Let V = (L,W).
Then Z(V) = F and the automorphisms in G induce in V a finite group G*
of automorphisms of the linear system V over F. This group G* is essentially
the same as G/W. Since by the construction of V the identity is the only
automorphism in G* which induces the identity in F, it follows from Theorem
6.3 that G* = ((V,G*) < V) and it may be noted that (V,G*) = (L,G)
= T, V = FT.

Since W ≠ 1, V < L. Since V is an admissible subgroup of the operator
group L over F, there exists a basis B of L over F which contains a basis U
of V over F. Clearly U < B. Now we distinguish two cases.


Case 1. V contains an infinity of elements. Let d be some element in B
that is not contained in U. If v is an element ≠ 0 in V, then an automorphism
g = g(v) is defined by the conditions: g’ = 1, $d^g = d + v$, $b^g = b$
for b ≠ d in B. Each g(v) is a V-automorphism and therefore a T-automorphism
of L. Since there exists an infinity of automorphisms g(v), it
follows that (T < L) is infinite so that the finite group G is certainly smaller
than ((L,G) < L).


Case 2. V contains but a finite number of elements. Then there exists
in B an infinity of elements d which are not contained in V and therefore not
in U, since it follows from the finiteness of V and from V ≠ 0, that the field
F contains but a finite number of elements. Let v be some element ≠ 0 in V.
If d is an element in B that is not contained in U, then an automorphism
h = h(d) is defined by the conditions: h’ = 1, $d^h = d + v$, $b = b^h$ for b ≠ d
in B. All these automorphisms are different. They are V-automorphisms and
therefore they are T-automorphisms of L. Consequently (T < L) is infinite
and therefore different from the finite group G. Hence we have proved that




G = ((L,G) < L) implies that the identity is the only automorphism in G
that induces the identity in F, provided L is infinite.


Remark. If L is a finite system, then every group of automorphisms of 
L is finite. In this case—using the notations of the proof of the preceding 


theorem—G = ((L,G) < L) if, and only if, W=((L,W) <L). 


7. In this section a short discussion is given of possibilities of extending
Theorem 6.4 and Theorem 6.5. The following assumptions will be made
throughout this section. L is a linear system over the commutative field F;
G is a finite group of automorphisms of L so that the automorphism g in G
satisfies g’ = 1 if, and only if, g = 1; T = (L,G). Then we prove:


There exists a set W between T and L so that


(a) T < W ≤ L,
(b) Z(T) = Z(W),


(c) W is complete in L and Z(W) contains every element f in F to which
there exists an element w ≠ 0 in W so that fw is in W


if, and only if, the order of G is greater than 2 and the rank of the
operator group L over F is greater than 1.


Proof. Let B be any basis of the operator group T over Z(T). Then
it is a consequence of Theorems 1.2 and 6.1 that B is a basis of the operator
group L over F. It is a consequence of Theorem 6.1 and of (4.5) that there
exists an element q in F so that the n elements $q^{g'}$ for g in G form a basis of
F over Z(T); and these elements form by Theorem 6.1 a basis of L over T.

Suppose now that the above conditions are satisfied. Then there exist
in B two different elements $b_1$ and $b_2$ and there exists in G an element v ≠ 1
so that $q^{1-v'}$ is not in Z(T). That this is possible is clear since
$\sum_{g \in G} q^{g'} = q \sum_{g \in G} q^{g'-1}$
is an element ≠ 0 in Z(T) and q is not in Z(T). The elements 1, v do not
exhaust G. Put $w = q b_1 + q^{v'} b_2$ and denote by W the set of all the elements
t + zw for t in T and z in Z(T). If g is an element in G which is different
from 1, then $w^g = q^{g'} b_1 + q^{v'g'} b_2 \neq w$, since $q \neq q^{g'}$, and since $b_1$, $b_2$ are
independent over F. Hence T < W ≤ L.

It is obvious that W contains the sums of any two of its elements and that
Z(T) ≤ Z(W). Suppose now that t + zw = r ≠ 0 and that f be an element
in F so that fr is in W. Then there exists an element s in T and an element h
in Z(T) so that


$$s + hq\,b_1 + hq^{v'} b_2 = fr = ft + fzq\,b_1 + fzq^{v'} b_2$$

or

$$(-) \qquad ft - s = (h - fz)\,q\,b_1 + (h - fz)\,q^{v'} b_2.$$

Suppose now that f is not in Z(T).


Case 1. ft − s = 0. Then it follows from the properties of T and from
the fact that both s and t are in T that t = 0. Hence r = zw and r ≠ 0
implies z ≠ 0. Consequently h − fz ≠ 0 and therefore $q b_1 + q^{v'} b_2 = 0$. But
this is impossible.


Case 2. ft − s ≠ 0. Then h − fz ≠ 0. Suppose first that t = 0. Then
we find for any element g in G:

$$(h - f^{g'}z)(q^{g'} b_1 + q^{v'g'} b_2) = (h - fz)(q\,b_1 + q^{v'} b_2),$$

since s is in T and h, z are in Z(T). Then

$$(h - f^{g'}z)\,q^{g'} = (h - fz)\,q \quad \text{and} \quad (h - f^{g'}z)\,q^{v'g'} = (h - fz)\,q^{v'}$$

and from h − fz ≠ 0, it follows that $h - f^{g'}z = (h - fz)^{g'} \neq 0$ so that
$q\,q^{v'g'} = q^{g'} q^{v'}$ or $q^{1-v'} = (q^{1-v'})^{g'}$ for every g’ in G’. Hence $q^{1-v'}$ is an element
in Z(T) and this is impossible by our choice of v, so that t ≠ 0.
Every element x in T has the form: $x = \sum_{b \in B} z(x,b)\,b$ for z(x,b) in Z(T).

Then it follows from (−) that

$$f z(t,b) - z(s,b) = 0 \quad \text{for } b \neq b_1, b_2,$$
$$f z(t,b_1) - z(s,b_1) = (h - fz)\,q,$$
$$f z(t,b_2) - z(s,b_2) = (h - fz)\,q^{v'},$$

since the elements b in B form a basis of the operator group L over F. Since
f is not in Z(T), we find z(t,b) = z(s,b) = 0 for $b \neq b_i$. Since t ≠ 0, this
implies that not both $z(t,b_i)$ are 0. Eliminating f from the remaining two
equations we find that

$$f[z(t,b_1) + zq] = z(s,b_1) + hq,$$
$$f[z(t,b_2) + zq^{v'}] = z(s,b_2) + hq^{v'}$$

and therefore

$$q[z\,z(s,b_2) - h\,z(t,b_2)] + q^{v'}[h\,z(t,b_1) - z\,z(s,b_1)] = -z(t,b_1)z(s,b_2) + z(t,b_2)z(s,b_1).$$
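The elimination of f in the last step is a purely formal computation in a commutative field, and it may be checked mechanically. The following sketch is an added illustration, not part of the original argument: rational numbers stand in for the elements of F, the variable names `t1`, `t2`, `s1`, `s2`, `qv` (for z(t,b_1), z(t,b_2), z(s,b_1), z(s,b_2) and the conjugate of q under v’) and the function name are hypothetical, and the quantities z(s,b_1), z(s,b_2) are defined so that the two equations containing f hold.

```python
from fractions import Fraction as Fr


def elimination_identity_holds(t1, t2, z, h, q, qv, f):
    """Check the relation obtained by eliminating f from the two equations
    f*(t1 + z*q) = s1 + h*q  and  f*(t2 + z*qv) = s2 + h*qv."""
    # Choose s1, s2 so that both equations are satisfied.
    s1 = f * (t1 + z * q) - h * q
    s2 = f * (t2 + z * qv) - h * qv
    lhs = q * (z * s2 - h * t2) + qv * (h * t1 - z * s1)
    rhs = -t1 * s2 + t2 * s1
    return lhs == rhs


# A few arbitrary rational test values (assumed, for illustration only).
samples = [
    (Fr(1), Fr(2), Fr(3), Fr(5), Fr(7), Fr(11), Fr(13)),
    (Fr(1, 2), Fr(-3, 4), Fr(2, 5), Fr(0), Fr(7, 3), Fr(-1, 6), Fr(9, 8)),
    (Fr(0), Fr(4), Fr(-2), Fr(1, 3), Fr(5), Fr(5, 2), Fr(-7)),
]
results = [elimination_identity_holds(*values) for values in samples]
```

The check only confirms the algebraic identity; the conclusions drawn from it in the text depend in addition on the independence of the conjugates of q over Z(T).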


Since the right side of this last equation is invariant under all the g’ in G’,
and since there are g’ in G’ which are different from both 1 and v’, it follows
now that

$$z\,z(s,b_2) = h\,z(t,b_2) \quad \text{and} \quad z\,z(s,b_1) = h\,z(t,b_1);$$




and consequently z ≠ 0, since not both $z(t,b_i)$ are 0. Since all the z(t,b)
and z(s,b) for $b \neq b_i$ are 0, it follows that zs = ht and therefore we find
from (−) that

$$(h - fz)[q\,b_1 + q^{v'} b_2] = ft - s = -(h - fz)\,z^{-1}t$$

or that $-z^{-1}t = q\,b_1 + q^{v'} b_2$, since h − fz ≠ 0. But this is impossible, since
$z^{-1}t$ belongs to T and $w = q\,b_1 + q^{v'} b_2$ does not. Hence it is impossible that
f is not in Z(T); and this shows that W satisfies (a) to (c).

Assume now conversely that W is a domain between T and L which satisfies
(b) and (c). To prove the necessity of our conditions we have to discuss
two cases:


Case I. The order of G is ≤ 2. Since there is nothing to prove, if
G = 1, we suppose that G consists of two different elements 1 and g. If w
is any element in W, then $w = t_1 q + t_2 q^{g'}$ with $t_i$ in T. Since T contains
$t = t_1(q + q^{g'})$, W contains $w - t = (t_2 - t_1)q^{g'}$. Since $t_2 - t_1$ is in T, and
since $q^{g'}$ does not belong to Z(T), it follows from (b), (c) that $t_1 = t_2$ so
that w = t is an element in T, i.e. W = T.


Case II. The rank of the operator group L over F is 1. Then let t be
any element ≠ 0 in T. If w is an element in W, then w = ft for some f in F
and it follows from (c), (b) that f is in Z(T) so that w is in T, i.e. W = T.


8. There exists a comparatively complete extension of the Galois Theory 
of finite groups of automorphisms (of commutative fields) to groups of auto- 
morphisms G which satisfy the condition: 

(F) The set of elements $f^g$ for g in G is finite for every f. The theory of
these groups may be described as follows.

(8.1) Let K be a subfield of the commutative field F. Then there exists a 
group G of automorphisms of F so that 

(1) the set $f^G$ (of all the elements $f^g$ for g in G) is finite for every element
f in F,

(2) K = (F,G)


if, and only if, F is algebraic, normal and separable over K.
(8.2) If F is algebraic, normal and separable over its subfield K, then
(a) conditions (1), (2) of (8.1) are satisfied by (K < F);


(b) F is algebraic, normal and separable over every field between K and F;




(c) every K-automorphism of a field between K and F is induced by an
automorphism of F;


(d) every finite set of elements in F is contained in a field between K and F
that is finite, normal and separable over K.


The groups G which satisfy condition (1) of (8.1) and G = ((F,G) < F) 
may be characterized by a certain closure property which we need not state, 
as we are not going to make any use of it. The following result is however 
of some importance for us. 

(8.3) Every subgroup S of the group G of automorphisms of F which has 
the property (1) of (8.1) satisfies S = ((F,S) < F) if, and only if G is 
finite. 


9. We shall now develop the Galois Theory of groups of automorphisms 
of linear systems which are subject to the above-mentioned condition (F) in 
analogy to the theories of sections 6 and 8.


THEOREM 9.1. The subset T of the linear system L over the commutative
field F satisfies:


(a) the identity is the only T-automorphism of L which induces the identity
in F,


(b) $x^{(T<L)}$ is a finite set for every element x in L,


(c) T = (L, (T < L))
if, and only if,


(i) T is complete in L,
(ii) F is algebraic, normal and separable over Z(T),
(iii) L is the direct product of F and T.


Proof. Suppose first that the conditions (a) to (c) are satisfied by T.
Then it is a consequence of Theorem 3.1 that T is complete in L, that
Z(T) = (F, (Z(T) < F)) and that L is the direct product of F and T.
There exists therefore an element t ≠ 0 in T. If f is any element in F, then
$(ft)^{(T<L)} = f^{(T<L)'}t$ is a finite set so that $f^{(T<L)'}$ is a finite set. Since L is
the direct product of F and T, it follows from Theorem 2.3 that (T < L)’
= (Z(T) < F) so that finally every set $f^{(Z(T)<F)}$ for f in F is finite. Now
it follows from (8.1) that F is algebraic, normal and separable over Z(T)
so that (i) to (iii) are consequences of (a) to (c).

If conversely the conditions (i) to (iii) are satisfied, then it follows from
(8.1) that Z(T) = (F, (Z(T) < F)) and it follows therefore from Theorem




3.1 that T = (L, (T < L)) and that the identity is the only T-automorphism
of L which induces the identity in F. Furthermore it follows from (8.1)
that every set $f^{(T<L)'}$ for f in F is finite, since by condition (iii) and Theorem
2.3 we have (T < L)’ = (Z(T) < F). By (iii) there exist to every element
x in L elements $t_i$ in T, $f_i$ in F so that $x = \sum_{i=1}^{n} t_i f_i$. Consequently $x^{(T<L)}$ is a
subset of the set $\sum_{i=1}^{n} t_i f_i^{(T<L)'}$ which is finite so that (a) to (c) are
consequences of (i) to (iii).
THEOREM 9.2. The group G of automorphisms of the linear system L
over the commutative field F has the properties:
(a) the identity is the only (L,G)-automorphism of L that induces the
identity in F and
(b) $x^{((L,G)<L)}$ is a finite set for every element x in L
if, and only if, the following conditions are satisfied by G:


(i) if S is a normal subgroup of finite index in G, and if S is the cross-cut
of G and ((L,S) < L), then every automorphism in G which induces the
identity in Z((L,S)) belongs to S;


(ii) $x^G$ is a finite set for every element x in L.


Proof. It is clear that (ii) is a consequence of (b), since G ≤
((L,G) < L). If T = (L,G), then it is a consequence of (a), (b) and of
(L,G) = (L, ((L,G) < L)) that conditions (a) to (c) of Theorem 9.1 are
satisfied so that T is complete in L, F is algebraic, normal and separable over
Z(T), and L is the direct product of F and T. If S is any subgroup of G and
B = (L,S), then B = (L, (B < L)) and it follows from Theorem 3.4 that
B is complete in L and that B = Z(B)T. If g is an automorphism in G
so that g’ is a Z(B)-automorphism, then g is a T-automorphism and therefore
a B-automorphism of L so that g belongs to the cross-cut of G and
((L,S) < L); and this contains (i) as a special case.

Suppose now that conditions (i) and (ii) are satisfied by the group G.
If u is any element ≠ 0 in L, then denote by S the set of all those automorphisms
in G which leave every element in the finite set $u^G$ invariant. Then
S is a normal subgroup of finite index in G; and G/S is essentially the finite
(transitive) group of permutations which G induces in $u^G$. Since U = (L,S)
contains $u^G$, it follows that S is the cross-cut of G and (U < L) (an automorphism,
leaving every element in U invariant, has in particular every
element in $u^G$ as a fixed element) so that the conditions of (i) are satisfied




by S. Hence S contains every automorphism in G which induces the identity
in Z((L,S)) = Z(U). Since U ≠ 0 (for U contains u ≠ 0) it follows
from (2.2) that U is complete in L and that Z(U) = (F,S’). G induces
in the linear system U over Z(U) a group G* of automorphisms, since S is a
normal subgroup of G, since $(L,S)^g = (L, g^{-1}Sg)$, and since therefore every
automorphism g in G induces an automorphism g* in U. If g is in G, and
if g* = 1, then g is in the cross-cut of G and (U < L) and therefore in S.
The group G* of all the automorphisms g* for g in G is therefore essentially
the same as G/S so that G* is in particular a finite group of automorphisms
of the linear system U over Z(U). If g*’ = 1, then g’ leaves every element
in Z(U) invariant so that (as has been remarked before) g belongs to S, i.e.
g* = 1. Finally T = (L,G) ≤ U = (L,S) and therefore (U,G*) = (L,G)
= T. Hence it follows from Theorem 6.1 that T is complete in U, U is the
direct product of Z(U) and T and Z(U) is finite, normal and separable
over Z(T).

Special consequences of this last result, as applied to every u in L, are
that T is complete in L and that L = FT; and this implies that (a) holds
true.

If t ≠ 0 is an element in T, f any element in F, then $(ft)^G = f^{G'}t$ is a
finite set of elements in L; and consequently $f^{G'}$ is a finite set of elements in F.
Since Z(T) = (F,G’) by (2.2), it follows from (8.1) that F is algebraic,
normal and separable over Z(T) so that Z(T) = (F, (Z(T) < F)).

Suppose now that the elements $b_1, \dots, b_n$ in F are independent over
Z(T). Denote by S the set of all the automorphisms g in G so that g’ leaves
every element in every set $b_i^{G'}$ invariant. Since all the sets $f^{G'}$ for f in F are
finite, it follows that S satisfies all the conditions of (i). Hence it follows
from what has been proved in the second paragraph of the proof that (L,S)
is the direct product of Z((L,S)) and T, and Z((L,S)) = (F,S’) contains
all the elements $b_i$. Since T = (L,G) = ((L,S),G*), in the notation
of the second paragraph of the proof, this implies that the $b_i$ are
independent over T too. Hence L is the direct product of F and T and now
it follows from Theorem 9.1 that the condition (b) of our theorem is satisfied
by G.*

Combining the results of Theorems 9.1 and 9.2 we find the following 
corollary which takes the place of Theorem 6.1 in this section.


* It might be worth noting that the conditions (i) to (iii) of Theorem 9.1 have
been derived here from the conditions (i) and (ii) of this Theorem 9.2 without any
recourse to Theorem 3.1.


COROLLARY 9.3. Let T be a subset of the linear system L over the field F.
Then there exists a group G of automorphisms of L which satisfies conditions
(i) and (ii) of Theorem 9.2 and which satisfies:

T = (L,G)

if, and only if, T is complete in L, F is algebraic, normal and separable over
Z(T), and L is the direct product of F and T.


COROLLARY 9.4. Suppose that the subset T of the linear system L over
the commutative field F is complete in L, and that F is algebraic, normal and
separable over Z(T). Then L is the direct product of F and T if, and only
if, every Z(T)-automorphism of F is induced by one and only one T-automorphism
of L.


Proof. It is a consequence of Theorem 2.3, (a) that every Z(T)-automorphism
of F is induced by one and only one T-automorphism of L,
if only L is the direct product of F and T. Assume conversely that every
Z(T)-automorphism of F is induced by one and only one T-automorphism
of L. Then it is a consequence of Theorem 2.3, (b) that L = FT. Since
(T < L)’ = (Z(T) < F), and since F is algebraic, normal and separable over
Z(T), it follows from (8.1) that Z(T) = (F, (Z(T) < F)) = (F, (T < L)’)
and consequently it follows now from Theorem 2.3, (c) that every basis of
the operator group T over Z(T) is a basis of the operator group L over F;
and hence it follows from Theorem 1.2 that L is the direct product of F and T.


THEOREM 9.5. A group G of automorphisms of the linear system L over
the commutative field F such that $x^G$ is a finite set for every element x in L,
and such that condition (i) of Theorem 9.2 is fulfilled by G, satisfies


G = ((L,G) < L) if, and only if, G’ = ((F,G’) < F).


Proof. It is a consequence of Theorem 9.2 and of the conditions imposed
on G, that the identity is the only (L,G)-automorphism of L which
induces the identity in F and that every set $x^{((L,G)<L)}$ is finite for every
x in L. Hence it follows from Theorem 9.1 that (L,G) = T is complete in L,
that F is algebraic, normal and separable over Z(T) and that L is the direct
product of F and T. It is a consequence of (2.2) that Z(T) = (F,G’), and
it is a consequence of Theorem 2.3, (a) that every (F,G’)-automorphism of
F is induced by one and only one (L,G)-automorphism of L. Now our
theorem is a consequence of Theorem 3.2.

It may finally be mentioned that Theorem 6.4 may be extended to our 


case with hardly any change. Another immediate consequence of the theorems 
of this section and of (8.3) is the following statement: 


Suppose that L is a linear system over the commutative field F, and that
the group G of automorphisms of L satisfies:
(a) $x^G$ is a finite set for every x in L;
(b) if S is a normal subgroup of finite index in G, and if S contains all the
(L,S)-automorphisms in G, then S contains every automorphism in G which
induces a Z((L,S))-automorphism in F.


Then every subgroup T of G satisfies T= ((L,T) < L) if, and only if, 
G is finite. 


10. Applications of the theory of linear systems to the theory of rings
and non-commutative fields shall be given in this section. If R is a ring, if the
commutative field F is part of the central of R, and if R and F have the same
identity, then it is clearly possible to consider R as a linear system over F,
since this only means restricting one’s attention to the addition in R and to
the multiplication of elements in R by elements in F.


LEMMA 10.1. If the field F is contained in the central of the ring R,
and if S is a subring of R which contains the unit-element of F and R, then
it is necessary and sufficient for the completeness of S in the linear system R
over F that the cross-cut of F and S be a field. If furthermore the linear
system R over F is the direct product of F and S (in the sense of section 1),
then every S-automorphism of the linear system R over F is at the same time
an automorphism of the ring R.


Proof. The first statement of the lemma is clear. If the linear system R
over the field F is the direct product of F and of its subring S, then let B be
a basis of F over the cross-cut Z(S) of S and F (Z(S) is a subfield of F).
There exist to every element x in R uniquely determined elements s(x,b) in
S (all of which with a finite number of exceptions are 0) so that
$x = \sum_{b \in B} b\,s(x,b)$.
If g is an automorphism of the linear system R over F, then g applied on F
alone is an automorphism of the field F. Suppose now that g is an S-automorphism
of the linear system R over F. Then

$$(xy)^g = \Bigl[\sum_{b,d \in B} bd\,s(x,b)s(y,d)\Bigr]^g = \sum_{b,d \in B} b^g d^g\,s(x,b)s(y,d) = x^g y^g$$

and this completes the proof.
This lemma shows in particular that the automorphisms, constructed in
Theorem 2.3, (a), are ring-automorphisms in our case.




As we are giving preference to the subfield F of the central of the ring R,
we consider as automorphisms of R only such ring-automorphisms of R which
map F upon itself, a hypothesis that will be satisfied for all the ring-automorphisms
of R, in case we assume F to be the central of R.

Consequently we use the following notation: If G is a group of automorphisms
of the ring R, then G’ is the group of automorphisms which the
automorphisms in G induce in F. If T is a subset of R, then (T < R)
consists of all those automorphisms of the ring R which map F upon itself and
leave every element in T invariant.

Now it has to be remarked that it is impossible to make use of Theorem
2.3, (b), since it may very well happen that there are no T-automorphisms
≠ 1 of the ring R which induce the identity in F whereas there may exist
T-automorphisms ≠ 1 of the linear system R over F which induce the identity
in F. On the other hand it is obvious that Theorem 2.3, (c) may be used,
since ring-automorphisms are at the same time automorphisms of the linear
system.

Let now T be a subring of R which contains the unit-element of F. No
other subsets will be considered. Then an element f in F satisfies fT ≤ T
if, and only if, f is in T too; and for this reason we denote by Z(T) the
cross-cut of F and T. T is complete in R if, and only if, Z(T) is a subfield
of F.

Now it is easy to derive the following statements from Theorems 6.1
and 6.3.


THEOREM A. Suppose that the cross-cut Z(T) of the subring T of the
ring R and of the subfield F of the central of the ring R is a subfield of F.
Then there exists a finite group G of automorphisms of the ring R, all of
which map F upon itself, such that the identity is the only F-automorphism
in G, and such that T = (R,G) if, and only if, F is finite, normal and
separable over Z(T) and the linear system R over F is the direct product of
T and F.

THEOREM B. If G is a finite group of automorphisms of the ring R
all of which map the subfield F of the central of R upon itself, and if the
identity is the only F-automorphism in G, then

G = ((R,G) < R).

The statements we are going to derive from Corollary 9.3 and Theorem
9.5 concern groups G of automorphisms of the ring R with the following
properties:


(1) $F = F^g$ for every g in G;




(2) $x^G$ is a finite set of elements for every x in R;

(3) if S is a normal subgroup of finite index in G, and if S contains all
those automorphisms in G which leave all the elements in (R,S) invariant,
then every Z((R,S))-automorphism in G belongs to S.


Note that a finite group G satisfies these conditions, if its automorphisms
map F upon itself, and if the identity is the only F-automorphism in G.

THEOREM A’. Suppose that F is a subfield of the central of the ring R,
and that T is a subring of R whose cross-cut with F is a subfield Z(T) of F.
Then there exists a group G of automorphisms of the ring R which satisfies
the above conditions (1) to (3) so that

T = (R,G)

if, and only if, F is algebraic, normal and separable over Z(T), and the linear
system R over F is the direct product of F and T.


THEOREM B’. If the group G of automorphisms of the ring R satisfies
conditions (1) to (3), then G’ = ((F,G’) < F) is a necessary and sufficient 
condition for G= ((R,G) < R). 

The following important and obvious consequence of Theorem A’ may 
be stated for future reference. 


LEMMA 10.2. Suppose that the central of the ring R is a field F, and
that the group G of automorphisms of the ring R satisfies conditions (2) 
and (3). 

(a) F is the centralizer of (R,G) in R. 
(b) Z((R,G)) is the central of (R,G). 

(b) is a consequence of (a); and (a) follows from the fact that by 
Theorem A’ the linear system R over F is equal to F(R,G), and that F is 
exactly the central of R. 


Finally it may be noted that the following statement may be derived
from Theorem 3.4.


THEOREM C. Suppose that F is a subfield of the central of the ring R,
and that the group G of automorphisms of the ring R satisfies conditions
(1) to (3).

(a) The set B between (R,G) and R satisfies B = (R, (B < R)) if, and
only if, there exists a field S between Z((R,G)) and F so that B = S(R,G).
(b) If S is a field between Z((R,G)) and F, then S = Z(S(R,G)).


It is the main objective of this section to show that in case of (non-commutative)
fields it is possible to prove an essentially stronger theorem than Theorem C.


LEMMA 10.3. If the central F of the ring R is a field, if G is a finite
group of automorphisms of R such that the identity is the only automorphism
in G which leaves every element in F invariant and such that (R,G) is a
subfield of R, if W is a ring between (R,G) and R whose cross-cut with F is
the same as the cross-cut of (R,G) and F, then W = (R,G).


Note that every automorphism of R maps F upon itself, since F is the
central of R, and that (R,G) though a field need not be a commutative field.


Proof. It is a consequence of Theorem A that F is finite, normal and
separable over the cross-cut Z(K) of K = (R,G) and F, and that the linear
system R over the commutative field F is the direct product of F and K.
There exists therefore by (4.5) an element b in F so that the elements $b^g$ for
g in G form a basis of F over Z(K), since the elements in G induce in F
an isomorphic group of automorphisms, and since (F,G) = Z(K). There
exist furthermore to every element x in R uniquely determined elements
c(x,g) in K so that $x = \sum_{g \in G} b^g c(x,g)$; and x belongs to K if, and only if,
all the elements c(x,g) are equal.

Assume now that K < W. Then there exist in W elements which are
not contained in K; and amongst these there is one, w, so that the number
of coefficients c(w,g) ≠ 0 is as small as possible. Since w is not in K, w ≠ 0
and there exists an automorphism v in G so that c(w,v) ≠ 0.

Let now t be any element in K. Then

$t\,w\,c(w,v)^{-1} - w\,c(w,v)^{-1}\,t = \sum_{g \in G} b^g \left[ t\,c(w,g)c(w,v)^{-1} - c(w,g)c(w,v)^{-1}\,t \right]$

would be in W and the number of its coefficients ≠ 0 would be smaller than
for w. Hence this element is in K so that all its coefficients are equal. Since
at least one of these coefficients is 0, all the coefficients are 0 so that
$t\,c(w,g)c(w,v)^{-1} = c(w,g)c(w,v)^{-1}\,t$ for every t in K, g in G.

Now it follows from Lemma 10.2 that $z(w,g) = c(w,g)c(w,v)^{-1}$ is an
element in F, and since z(w,g) is an element in K, it is in the cross-cut Z(K)
of K and F. Hence $w = \sum_{g \in G} z(w,g)\,b^g\,c(w,v) = f\,c(w,v)$. Since w ≠ 0,
f is an element in the cross-cut of W and F; and it follows from our hypothesis
that f is in Z(K). Thus w would be in K and this is a contradiction so that
finally W = K.


THEOREM 10.4. If G is a finite group of automorphisms of the (non-commutative)
field Q such that the identity is the only automorphism in G
which leaves every element in the central F of Q invariant, then every ring R
between (Q,G) and Q whose cross-cut with F is a field satisfies:

R = (Q,(R < Q)).

Note that every (Q,H) is a subfield of Q so that the condition imposed
on R is necessary and sufficient and implies that R is a field.


Proof. Denote by Z(R) the cross-cut of R and F. Z(R) is a subfield
of F which contains the cross-cut Z(K) of K = (Q,G) and F. Put
S = Z(R)K. Then K ≤ S ≤ R; and it follows from Theorem A and
Theorem C that Z(S) = Z(R) and S = (Q,(S < Q)), since F is finite,
normal and separable over Z(K) and since therefore the field Z(R) between
Z(K) and F satisfies: Z(R) = (F,(Z(R) < F)). This implies in particular
that S is a field. Since (S < Q) ≤ G, it follows now from Lemma
10.3 that S = R and this completes the proof.


THEOREM 10.5. If K is a subfield of the field Q such that the identity
is the only K-automorphism of Q which leaves every element in the central
F of Q invariant, such that the sets $x^{(K<Q)}$ are finite for every element x in
Q, and K = (Q,(K < Q)), then every ring R between K and Q whose cross-cut
with F is a field satisfies:

R = (Q,(R < Q)).


Proof. It is a consequence of Theorem 9.1 that F is algebraic, normal
and separable over the cross-cut Z(K) of K and F, and that the linear system
Q over F is the direct product of F and K, i.e. the group (K < Q) satisfies
the conditions (1) to (3). If the cross-cut Z(R) of the ring R between K
and Q is a field, then Z(R) is a field between Z(K) and F; and it follows
from (8.1) and (8.2) that (F,(Z(R) < F)) = Z(R). It is then a consequence
of Theorem C that the domain S = Z(R)K satisfies: S = (Q,(S < Q))
and Z(S) = Z(R). Since (S < Q) ≤ (K < Q) it follows that the identity
is the only S-automorphism of Q which induces the identity in F and that
every set $x^{(S<Q)}$ is finite. Finally it is clear that S is a field which is contained
in R.

Suppose now that u is any element in R. Denote by U the set of all the
S-automorphisms of Q which leave all the elements in $u^{(S<Q)}$ invariant and
put V = (Q,U). It is clear that U is a normal subgroup of finite index in
(S < Q), that $u^{(S<Q)}$ ≤ V and that therefore U = ((Q,U) < Q). It is a
consequence of Theorem 9.2 that an S-automorphism of Q which induces the
identity in Z(V) leaves every element in V invariant. Thus (S < Q) induces
in the field V with central Z(V) a finite group G of automorphisms so that
the identity is the only automorphism in G which induces the identity in
Z(V) and so that (V,G) = S. Denote now by D the cross-cut of V and R.
D is a ring between S and V which contains u and whose cross-cut with Z(V)
is just Z(R) = Z(S). Hence it follows from Lemma 10.3 that D = S so
that in particular u is an element in S. Hence S = R and this completes
the proof.


CHAPTER III. Crossed Products.*

11. The extension of the concept of crossed product we are going to 
give here concerns itself with a (not necessarily commutative) field Q whose 
central may be denoted by F and a group G of automorphisms of the field Q 
which is subject to the following conditions: 
(I) Q is the direct product of F and the subfield K = (Q,G) of Q;
(II) K = (Q,(K < Q)). 

Two important inferences of (I) may be stated at once.
(I') The identity is the only F-automorphism in G.
(I'') F is the centralizer of K in Q; and the central of K is its cross-cut Z
with F.

Given condition (I), one verifies that (II) is equivalent to the following
condition:
(II') Z = (F,(Z < F)).

It may be noted furthermore that a consequence of (I) is 


(I*) Every Z-automorphism of F is induced by one and only one K-automorphism
of Q.

Upon occasion we shall have to use the further restriction: 
(III) G = (K < Q).

In denoting by g' the automorphism of F which is induced by g in Q,
it is a consequence of (I*) that (III) is equivalent to the following assumption.
(III') G' = (Z < F).

Conditions (I) to (III) are satisfied by all those finite groups G of
automorphisms of Q whose only F-automorphism is the identity. Conditions
(I) and (II) are satisfied by the more general class of groups which satisfy
the conditions (2), (3) stated in section 10.

Now we connect with every element g in G an indeterminate u(g) and 


consider the system QG of all the linear forms: 


* For a presentation of the classical theory of crossed products cp. e.g. H. Hasse,
Transactions of the American Mathematical Society, vol. 34 (1932), pp. 171-214.




$\sum_{g \in G} u(g)\,q(g)$

where the q(g) are elements in Q all but a finite number of which are 0.
It is clear how to add two such forms and how to multiply them by elements
in Q [from the right].
In this linear system QG over Q a multiplication shall be defined which
is subject to the following rules:
(A) $q\,u(g) = u(g)\,q^g$ for q in Q and g in G;
(F) if g and h are two elements in G, then there exists an element (g,h)
in Q so that u(g)u(h) = u(gh)(g,h).
The elements (g,h) are called a factor-set and the linear system QG
enriched by this multiplication is termed the crossed-product
(Q, G, (g,h)).
(11.1) The multiplication in (Q, G, (g,h)) is associative if, and only if,
(i) every (g,h) is in F;
(ii) $(r,st)(s,t) = (rs,t)(r,s)^t$ for r, s, t in G.
Proof. Suppose first that the multiplication is associative. If q is any
element in Q and g, h are elements in G, then
$u(gh)(g,h)\,q = u(g)u(h)\,q = q^{h^{-1}g^{-1}}\,u(g)u(h) = q^{(gh)^{-1}}\,u(gh)(g,h) = u(gh)\,q\,(g,h)$
or
(g,h)q = q(g,h) for every q in Q so that (i) holds true.
If furthermore r, s, t are three elements in G, then
$u(rst)(r,st)(s,t) = u(r)u(st)(s,t) = u(r)u(s)u(t) = u(rs)(r,s)u(t) = u(rs)u(t)(r,s)^t = u(rst)(rs,t)(r,s)^t$
and this proves the necessity of (ii).
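If conversely (i) and (ii) are satisfied, then
$[\sum_r u(r)a(r)]\,[\sum_s u(s)b(s) \sum_t u(t)c(t)]$
$= \sum_{r,s,t} u(r)a(r)\,u(s)b(s)\,u(t)c(t)$
$= \sum_{r,s,t} u(r)a(r)\,u(s)u(t)\,b(s)^t c(t)$
$= \sum_{r,s,t} u(r)a(r)\,u(st)(s,t)\,b(s)^t c(t)$
$= \sum_{r,s,t} u(r)u(st)\,a(r)^{st}(s,t)\,b(s)^t c(t)$
$= \sum_{r,s,t} u(rst)(r,st)(s,t)\,a(r)^{st} b(s)^t c(t)$
$= \sum_{r,s,t} u(rst)(rs,t)(r,s)^t\,a(r)^{st} b(s)^t c(t)$
$= \sum_{r,s,t} u(rs)u(t)\,(r,s)^t a(r)^{st} b(s)^t c(t)$
$= \sum_{r,s,t} u(rs)(r,s)\,a(r)^s b(s)\,u(t)c(t)$
$= \sum_{r,s,t} u(r)u(s)\,a(r)^s b(s)\,u(t)c(t)$
$= \sum_{r,s,t} [u(r)a(r)\,u(s)b(s)]\,u(t)c(t)$
$= [\sum_r u(r)a(r) \sum_s u(s)b(s)]\,\sum_t u(t)c(t)$
and this completes the proof.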
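
Statement (11.1) can be tried out on the smallest classical example. The following Python sketch is an illustration added for the reader and is not part of the original argument. It assumes that Q is the complex field (sampled at Gaussian integers so that floating-point arithmetic stays exact), that G is the group of order two generated by complex conjugation, and it introduces two hypothetical factor-sets named QUATERNION and BROKEN. Since this Q is commutative, condition (i) holds automatically and only (ii) is tested.

```python
import itertools

G = (0, 1)  # 0 = identity, 1 = complex conjugation; the group law is XOR


def act(q, g):
    """Return q^g, the image of q under the automorphism g."""
    return q.conjugate() if g else q


def satisfies_ii(factor):
    """Condition (ii): (r,st)(s,t) = (rs,t)(r,s)^t for all r, s, t in G."""
    return all(
        factor[r, s ^ t] * factor[s, t]
        == factor[r ^ s, t] * act(factor[r, s], t)
        for r, s, t in itertools.product(G, repeat=3)
    )


def multiply(factor, x, y):
    """Product of x = sum u(g) x[g] and y = sum u(h) y[h]."""
    z = {g: 0 for g in G}
    for g in G:
        for h in G:
            # u(g) a u(h) b = u(g) u(h) a^h b = u(gh) (g,h) a^h b by (A), (F)
            z[g ^ h] += factor[g, h] * act(x[g], h) * y[h]
    return z


def is_associative(factor, samples):
    """Test (xy)z == x(yz) on every triple drawn from the sample elements."""
    return all(
        multiply(factor, multiply(factor, x, y), z)
        == multiply(factor, x, multiply(factor, y, z))
        for x, y, z in itertools.product(samples, repeat=3)
    )


# Hypothetical factor-sets: the first satisfies (ii), the second does not.
QUATERNION = {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): -1}
BROKEN = {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 1j}

# Sample elements u(0) x[0] + u(1) x[1] with Gaussian-integer coefficients.
SAMPLES = [{0: 1 + 2j, 1: 0}, {0: 0, 1: 1}, {0: 3 - 1j, 1: 2 + 1j}]
```

With QUATERNION the crossed product is the classical quaternion algebra and the multiplication is associative; BROKEN violates (ii) for r = s = t equal to conjugation, and the triple product of u(g) with itself already fails to associate.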

This statement explains why we have to and are going to restrict ourselves
to the consideration of factor-sets which satisfy the above conditions
(i) and (ii).

As one verifies easily that the element $u(1)(1,1)^{-1}$ is the unit element
in (Q, G, (g,h)), we may assume without loss of generality that

(iii) (g,1) = (1,h) = 1 or u(1) = 1.
Finally one verifies that $u(g^{-1})(g,g^{-1})^{-1}$ is the inverse to u(g).


12. In this section we discuss the general structural properties of crossed 


products 
P= (Q, G, (g, h)) 


where Q is a field, G a group of automorphisms of Q which satisfies (I) and 
(II), and where (g,h) is a factor-set of G in Q which satisfies (i) to (iii). 
(12.1) P is simple.
Proof. Suppose that W is a two sided ideal ≠ 0 in P. Then there
exists among the elements $w = \sum_{g \in G} u(g)\,q(w,g) \ne 0$ in W at least one such
that the number of g with q(w,g) ≠ 0 is as small as possible. Let v be such
an element, and suppose that u is an automorphism with q(v,u) ≠ 0. If g
is another automorphism in G, and if u ≠ g, then it is a consequence of (I*)
(in section 11) that u' ≠ g' and there exists therefore an element f in F so
that $f^u \ne f^g$. Clearly $fv - vf^u = \sum_{h \in G} u(h)\,(f^h - f^u)\,q(v,h)$ is an element
in W; and it is 0, since the number of its coefficients ≠ 0 is smaller than for v.
Hence in particular $(f^g - f^u)\,q(v,g) = 0$ and this implies q(v,g) = 0 for
every g ≠ u. This implies that u(u) itself is an element in W; and hence
all the u(g) are in W, i.e. P = W.


(12.2) (Q,G,(g,h)) is the direct product of (F,G’, (g,h)) and K. 


This is an obvious consequence of condition (I) in section 11. 
An interesting consequence of this statement and of a well-known property 




of crossed-products of commutative fields by finite groups of automorphisms 
may be stated separately. 


(12.2*) If G is a finite group of automorphisms of the field Q so that the
identity is the only central-automorphism in G, then (Q, G, (g,h) = 1) is a
full matrix-algebra over the field (Q,G).


(12.3) An element w in P = (Q,G, (g,h)) satisfies wF = Fw if, and only 
if, it has the form u(g)q for some g in G and q in Q. 


Proof. That elements of the form w = u(g)q satisfy wF = Fw, is
obvious. If on the other hand w ≠ 0 satisfies wF = Fw, then there exists to
every element f in F an element f* in F so that fw = wf*. If $w = \sum_{g \in G} u(g)\,q(w,g)$,
then this implies that $f^g q(w,g) = f^{*} q(w,g)$ for every g in G. If u and v
are two different automorphisms in G so that both q(w,u) and q(w,v) are
different from 0, then this would imply that $f^u = f^v$ for every f in F; and
this is impossible by (I*) of section 11. Hence w = u(g)q.


(12.4) Q is the centralizer of F in P.
If w is an element, satisfying wf = fw for every f in F, then it follows
from (12.3) that w = u(g)q and it follows from (I*) that g = 1.


(12.5) Q is uniquely determined as the greatest subfield of P which is contained
in the normalizer of F in P.


Proof. Suppose that U is some subfield of the normalizer of F in P.
Then it follows from (12.3) that every element in U has the form u(g)q.
If u(g)q and u(h)p are two elements in U which are both different from 0,
then their sum is in U and therefore of the one-term-form, i.e. g = h. For
the same reason $g = g^2$, i.e. g = 1 so that U ≤ Q.


(12.6) Z is the central of P. 


For elements of the central belong to Q by (12.4), hence to F. They 
belong to K and therefore to Z, since they permute with the elements u(g). 


(12.7) K is the centralizer of (F,G',(g,h)).
This follows from (12.4), since the u(g) are in (F,G',(g,h)).


13. It is a consequence of (12.3) that the normalizer of both the fields
F and Q in P = (Q,G,(g,h)) is, apart from 0, the group generated in
adjoining the elements u(g) to Q. Denote the set of all the elements w in
P which satisfy wF = Fw (or wQ = Qw) by N, so that the elements ≠ 0
in N form the group we have described just now.

Every automorphism of P maps the central Z = (F,G') upon itself.
But there may exist automorphisms of P which do not map F upon itself.




If however an automorphism of P maps F upon itself, then it maps Q and N
upon themselves. If conversely an automorphism of P maps N upon itself,
then it follows from (12.5) that this automorphism maps Q upon itself; and
if an automorphism maps Q upon itself, then F is mapped upon itself too,
since F is the central of Q. In this section we shall investigate those automorphisms
of P which map F, Q and N upon themselves.

If the automorphism r of P maps F, Q and N upon themselves, then
r induces an automorphism r* in Q; and r*', in the usual notation, is the
automorphism which r and r* induce in F. Since r induces an automorphism
in the group N* which maps the cross-cut Q* of Q and N* upon itself, it
follows that r induces an automorphism in the quotient group N*/Q*, and
since G and N*/Q* are essentially the same, it follows that r induces an
automorphism r'' in G, and that
(a) $u(g)^r = u(g^{r''})\,r(g)$ with r(g) in Q*.


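Applying now r upon condition (A) of section 11, we find that

$u(g^{r''})\,q^{r^{*}g^{r''}}\,r(g) = q^{r^{*}}\,u(g^{r''})\,r(g) = q^r u(g)^r = (q\,u(g))^r = (u(g)\,q^g)^r = u(g^{r''})\,r(g)\,q^{g r^{*}}$
or
(b) $q^{r^{*}g^{r''}}\,r(g) = r(g)\,q^{g r^{*}}$ for q in Q and g in G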


or what amounts to the same

$q^{g^{r''}} = r(g)\,q^{r^{*-1} g\, r^{*}}\,r(g)^{-1}$ for q in Q and g in G.
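Applying the automorphism r upon condition (F) of section 11, and in
using conditions (i), (ii) of (11.1), it follows that
$u((gh)^{r''})\,r(gh)\,(g,h)^{r^{*}} = [u(g)u(h)]^r = u(g^{r''})\,r(g)\,u(h^{r''})\,r(h) = u((gh)^{r''})\,(g^{r''},h^{r''})\,r(g)^{h^{r''}}\,r(h)$ or
(c) $(g,h)^{r^{*}} = r(gh)^{-1}\,(g^{r''},h^{r''})\,r(g)^{h^{r''}}\,r(h)$ for g, h in G.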


Thus we have seen that every automorphism r of P which maps F, Q
and N upon themselves induces an automorphism r* of Q, an automorphism
r'' of G and, by (a), a Q-valued function r(g) of the elements in G; and
it is obvious that r is uniquely determined by r*, r'' and r(g).

THEOREM 13.1. Suppose that r* is an automorphism of Q, r'' an automorphism
of G, and that r(g) is a Q-valued function of the elements g in G.
Then there exists an automorphism r of P which induces r* in Q, r'' in G and
satisfies (a) if, and only if, r*, r'' and r(g) obey the rules (b), (c).


Proof. The necessity of these conditions has already been verified.
Thus assume that (b) and (c) are satisfied by r*, r'' and r(g). If x is any
element in P, then there exist uniquely determined elements q(x,g), all of
which with a finite number of exceptions are 0, so that

$x = \sum_{g \in G} u(g)\,q(x,g)$.

A transformation r of P may be defined by

(d) $x^r = \sum_{g \in G} u(g^{r''})\,r(g)\,q(x,g)^{r^{*}}$

and this transformation satisfies clearly (a), induces r* in Q, since u(1) = 1;
and it will be clear that r induces r'' in G, as soon as we have proved that r
is an automorphism of P. This transformation is a one-one-correspondence,
mapping P upon the whole set P, since the equation $y^r = x$ possesses one and
only one solution y in P, namely the element y with the coefficients
$q(y,g) = [\,r(g)^{-1}\,q(x,g^{r''})\,]^{r^{*-1}}$.
That r preserves addition is clear; that it preserves
multiplication is verified as follows:
$(xy)^r = [\sum_{g,h} u(gh)(g,h)\,q(x,g)^h\,q(y,h)]^r$
$= \sum_{g,h} u(g^{r''})\,r(g)\,u(h^{r''})\,r(h)\,q(x,g)^{h r^{*}}\,q(y,h)^{r^{*}}$
$= \sum_{g,h} u(g^{r''})\,r(g)\,q(x,g)^{r^{*}}\,u(h^{r''})\,r(h)\,q(y,h)^{r^{*}} = x^r y^r$
and this completes the proof.


Restricting (b) to elements in F only we find
(b'') $(g^{r''})' = (r^{*\prime})^{-1}\,g'\,r^{*\prime}$ for g in G


and in applying (c) on g = h = 1 we find that 


(c’) r(1) =1 
and in applying (c) on h = g™ we derive from (c’) that 


14. In this section we add successively new hypotheses to those used in
the preceding sections. To the hypothesis that r*, r'' and r(g) satisfy the
conditions (b), (c) of section 13 we add first:


(1) r*' is an element in G'.




We note first that this assumption is certainly satisfied, whenever r* is a 
Z-automorphism of Q and G satisfies condition (III) of section 11. 
From (1) it follows that there exists an automorphism w in G so that
w' = r*'. Then it follows from (b'') that


and this implies by (I*) of section 11 that 


(1') $g^{r''} = w^{-1} g\,w$.
Let now be $s^{*} = r^{*} w^{-1}$. Then it follows from (b) that


(1'') $q^{s^{*}g}\,r(g)^{w^{-1}} = r(g)^{w^{-1}}\,q^{g s^{*}}$ for q in Q and g in G.


Now we add another hypothesis. 
(2) Every F-automorphism of Q is an inner automorphism of Q.


It is known that this hypothesis is a consequence of the finiteness of Q 
over F, and that this hypothesis is not always satisfied. 

Since s*' = 1, from (2) follows the existence of an element b in Q
so that

$q^{s^{*}} = b^{-1} q\,b$ for every q in Q.


Applying this on (1'') we find

$q^g\,[\,b^g\,r(g)^{w^{-1}}\,b^{-1}\,] = [\,b^g\,r(g)^{w^{-1}}\,b^{-1}\,]\,q^g$
and it is a consequence of the fact that F is the central of Q that
$b^g\,r(g)^{w^{-1}}\,b^{-1}$ is an element in F. Hence there exists an element f(g) in F
so that
(2’) = f(g) f(g)b gw)yw 


and it is a consequence of (c) that this F-valued function f(g) satisfies
(2”) = f(gh)4f(g)e (h). 


If the F-valued function f(g) satisfies condition (2”), then the identity- 
automorphism of Q together with the inner automorphism which is induced 
in G by w together with this function f(g) satisfy the conditions of the 
Theorem 13.1 so that they are induced by an automorphism of P which is a 
Q-automorphism of P and therefore a central-automorphism of P. 


If we now add the final hypothesis that 


(3) K-automorphisms of (F,G',(g,h)) are inner automorphisms then it
follows from the existence of an automorphism of P which leaves all the
elements in Q invariant, induces in G the inner automorphism effected by w,
and induces f(g) according to (a) of section 13, that there exists an element




in (F,G',(g,h)) which induces this automorphism. This element has by
necessity the form u(w)f for f in F, and now it follows that


u(wigw) f(g) = fru(w)u(g)u(w)f 

= 

= w)“u(w gw) (w"!, gw) (g, w)f 

or 
f(g) = (g/w) for (g/w) = (wt, (wt, gee) (gw) 
and the function r(g) has by (2’) and (3’) the form 

r(g) = (b"f) (g/w). 

The most important special case of all these considerations may be stated 
separately. 

If the field Q is finite over its central F, if G is a finite group of automorphisms
of Q so that the identity is the only F-automorphism in G, if (g,h)
is a factor-set of G in F which satisfies conditions (ii) and (iii) of section 11,
if r* is an (F,G')-automorphism of Q, r'' an automorphism of G and r(g)
a Q-valued function of the elements in G so that


$q^{r^{*}g^{r''}}\,r(g) = r(g)\,q^{g r^{*}}$ for q in Q, g in G, and
$(g,h)^{r^{*}} = r(gh)^{-1}\,(g^{r''},h^{r''})\,r(g)^{h^{r''}}\,r(h)$ for g, h in G,
then there exists an element w in G and an element v in Q so that 


w’ = r*’, = and 
r(g) = vw (g/w) 


where 


(g/w) = gw) (g,w). 
If we choose in particular the factor-set (g,h) = 1, then we see: 
If w is an automorphism in G and r(g) a Q-valued function so that 
r(gh) 
then there exists an element v in Q so that 
r(g) 
Finally it ought to be mentioned that the element v induces in Q the same 


automorphism as 


THE UNIVERSITY OF ILLINOIS, 
URBANA, ILL. 




THE NUMBER OF REPRESENTATIONS FUNCTION FOR BINARY 
QUADRATIC FORMS.* 


By NEWMAN A. HALL.


The problem of finding the number of representations of an arbitrary
integer by a given binary quadratic form has yet to be solved in complete
generality. In the two centuries that have followed the first general investigation
by J. L. Lagrange² of any part of the problem, the investigations have
proceeded in two directions. A great number of specific forms have been considered
individually for which more or less general solutions have been given.
Again, certain general investigations have reduced the problem to more simple
and direct questions. The early investigations of Dirichlet³ and more recently
those of Pall⁴ are of this nature.

In the discussion to follow, we offer as a contribution to the general 
problem, the general explicit expressions for the number of representations 
function for all forms whose discriminant is such that there is a single class 
in each genus together with a specific example showing the numerical com- 
putation of the number of representations. 

We are concerned with binary quadratic forms designated by [a, b, c],
of discriminant $-\Delta = b^2 - 4ac$, and shall examine the form of the number
of representations function

$N[m = ax^2 + bxy + cy^2]$,

this being the number of solutions in integers, x and y, of

$m = ax^2 + bxy + cy^2$.


As is customary only forms which are positive definite and whose coefficients 
have no common factors, i.e. are primitive, will be considered. 
We shall base the investigation on the following theorem of Dirichlet:⁵


* Received September 20, 1939.

¹ Presented to the American Mathematical Society September 6, 1938, cf. Bulletin
of the American Mathematical Society, vol. 44 (1938), p. 488.

² J. L. Lagrange, "Recherches d'Arithmetique," Oeuvres, t. 3, pp. 693-785.

³ G. L. Dirichlet, Zahlentheorie, ed. 4 (1894), p. 229.

⁴ G. Pall, Mathematische Zeitschrift, vol. 36 (1932), pp. 321-343.

⁵ Dirichlet, loc. cit.




590 NEWMAN A. HALL. 


THEOREM 1. Let m be positive and prime to Δ. The number of representations
of m by all the reduced forms of discriminant −Δ is $w \sum_{\mu \mid m} (-\Delta/\mu)$,
where w = 2, if Δ > 4; w = 4, if Δ = 4; w = 6, if Δ = 3, and (−Δ/μ) is
Kronecker's symbol.
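
Theorem 1 lends itself to a direct numerical check. The following Python sketch is an illustration added here, not part of the paper: the helper names (kronecker, reduced_forms, count_reps, dirichlet_count) are hypothetical, and the enumeration of reduced forms uses the usual conditions −a < b ≤ a ≤ c, with b ≥ 0 when a = c, which the text itself does not spell out.

```python
from math import gcd, isqrt


def kronecker(a, n):
    """Kronecker symbol (a/n) for an integer a and a positive integer n."""
    sign = 1
    while n % 2 == 0:              # one factor (a/2) per power of 2 in n
        if a % 2 == 0:
            return 0
        if a % 8 in (3, 5):
            sign = -sign
        n //= 2
    a %= n                         # n is now odd: Jacobi symbol by reciprocity
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                sign = -sign
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            sign = -sign
        a %= n
    return sign if n == 1 else 0


def reduced_forms(delta):
    """Primitive reduced forms (a, b, c) with 4ac - b^2 = delta."""
    forms = []
    a = 1
    while 3 * a * a <= delta:
        for b in range(-a + 1, a + 1):
            c, rem = divmod(b * b + delta, 4 * a)
            if rem or c < a or (c == a and b < 0):
                continue
            if gcd(gcd(a, b), c) == 1:
                forms.append((a, b, c))
        a += 1
    return forms


def count_reps(form, m):
    """Number of integer pairs (x, y) with a x^2 + b x y + c y^2 = m."""
    a, b, c = form
    delta = 4 * a * c - b * b
    x_max = isqrt(4 * c * m // delta) + 1
    y_max = isqrt(4 * a * m // delta) + 1
    return sum(
        1
        for x in range(-x_max, x_max + 1)
        for y in range(-y_max, y_max + 1)
        if a * x * x + b * x * y + c * y * y == m
    )


def dirichlet_count(delta, m):
    """Right-hand side of Theorem 1: w times the divisor sum of (-delta/mu)."""
    w = {3: 6, 4: 4}.get(delta, 2)
    return w * sum(kronecker(-delta, mu) for mu in range(1, m + 1) if m % mu == 0)
```

For example, summing count_reps over reduced_forms(280) for an m prime to 280 should agree with dirichlet_count(280, m).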


There are quantities associated with a particular form, invariant in that 
they are equal for all integers represented by said form which separate the 
forms of given discriminant into genera which may or may not coincide with 
the several classes. These invariants, the so-called characters, are defined by 


THEOREM 2.⁶ If $p_1, p_2, \dots, p_k$ are the distinct odd prime factors of Δ,
then $(n/p_i)$ has the same value for all integers n prime to Δ, represented by
a form [a,b,c] of discriminant −Δ. When Δ is even, Δ = −4D, the same
is true of

$\delta = (-1)^{(n-1)/2}$, if D ≡ 0 or 3 (mod 4)
$\epsilon = (-1)^{(n^2-1)/8}$, if D ≡ 0 or 2 (mod 8)
$\delta\epsilon$, if D ≡ 0 or 6 (mod 8).


The set, $C_1, C_2, \dots, C_h$, will stand for the characters belonging to a
certain discriminant, excluding δε if both δ and ε are characters. The number
of these, h, will equal k, k + 1, or k + 2 according to the nature of the discriminant
as indicated above. The notation, $C_i(n)$, is to represent the value
of the character $C_i$ for n or the form representing n.
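
The rules of Theorem 2 can be written out as a short program. The Python sketch below is an added illustration under stated assumptions: the function names are hypothetical, D is taken as −Δ/4 so that Δ = −4D as in the theorem, and the invariance asserted by the theorem is only sampled over a finite box of values x, y.

```python
from math import gcd


def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p not dividing n."""
    return 1 if pow(n, (p - 1) // 2, p) == 1 else -1


def odd_prime_factors(n):
    """Distinct odd prime factors of n, in increasing order."""
    while n % 2 == 0:
        n //= 2
    primes, p = [], 3
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 2
    if n > 1:
        primes.append(n)
    return primes


def characters(delta):
    """List of (name, function) pairs: the characters C_1, ..., C_h."""
    chars = [(f"(n/{p})", lambda n, p=p: legendre(n, p))
             for p in odd_prime_factors(delta)]
    if delta % 4 == 0:
        D = -delta // 4                      # Theorem 2 writes delta = -4D
        d = ("delta", lambda n: (-1) ** ((n - 1) // 2))
        e = ("epsilon", lambda n: (-1) ** ((n * n - 1) // 8))
        if D % 4 in (0, 3):
            chars.append(d)
        if D % 8 in (0, 2):
            chars.append(e)
        if D % 8 == 6:                       # delta*epsilon only when alone
            chars.append(("delta*epsilon", lambda n: d[1](n) * e[1](n)))
    return chars


def character_values(form, delta, bound=15):
    """Set of character tuples taken on by represented n prime to delta."""
    a, b, c = form
    chars = characters(delta)
    values = set()
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            n = a * x * x + b * x * y + c * y * y
            if n > 0 and gcd(n, delta) == 1:
                values.add(tuple(C(n) for _, C in chars))
    return values
```

If the theorem holds, character_values returns a set with a single tuple for every form; for Δ = 280 the tuple lists (n/5), (n/7), ε in that order.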

All forms of a given discriminant whose characters have the same value 
are said to form a genus. Since equivalent forms represent the same numbers, 
all forms in the same class are in the same genus. 

When there is a single class in each genus we may proceed using these
characters to give the explicit form for the number of representations function
for integers m prime to 2Δ. If [a,b,c] represents some integer s, Theorem
2 states $C_i(m) = C_i(s)$ as a necessary condition that m be represented at all
by [a,b,c]. Since we assume a single class in each genus, each reduced form
has different values for the characters. Hence by Theorem 1 we obtain


THEOREM 3. Let [a,b,c] be a form of discriminant −Δ < −4 such
that there is a single class of forms in each genus. If m is an integer prime
to 2Δ

$N[m = ax^2 + bxy + cy^2] = \frac{1}{2^{h-1}} \prod_{i=1}^{h} [1 + C_i(a)C_i(m)] \sum_{\mu \mid m} (-\Delta/\mu)$.
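
The formula of Theorem 3 can be compared with a brute-force count. The Python sketch below is an added illustration for the discriminant −280 only; the helper names are hypothetical, and the character value of a form is read off from the smallest represented integer s prime to 2Δ, following the paragraph that precedes the theorem.

```python
from math import gcd, isqrt


def kronecker(a, n):
    """Kronecker symbol (a/n) for an integer a and a positive integer n."""
    sign = 1
    while n % 2 == 0:
        if a % 2 == 0:
            return 0
        if a % 8 in (3, 5):
            sign = -sign
        n //= 2
    a %= n
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                sign = -sign
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            sign = -sign
        a %= n
    return sign if n == 1 else 0


def count_reps(form, m):
    """Number of integer pairs (x, y) with a x^2 + b x y + c y^2 = m."""
    a, b, c = form
    delta = 4 * a * c - b * b
    x_max = isqrt(4 * c * m // delta) + 1
    y_max = isqrt(4 * a * m // delta) + 1
    return sum(
        1
        for x in range(-x_max, x_max + 1)
        for y in range(-y_max, y_max + 1)
        if a * x * x + b * x * y + c * y * y == m
    )


# Characters for discriminant -280: epsilon(n) = (2/n), (n/5), (n/7).
CHARS = (
    lambda n: kronecker(2, n),
    lambda n: kronecker(n, 5),
    lambda n: kronecker(n, 7),
)


def represented_unit(form, delta):
    """Smallest positive s prime to 2*delta that the form represents."""
    s = 1
    while gcd(s, 2 * delta) != 1 or count_reps(form, s) == 0:
        s += 1
    return s


def theorem3(form, m, delta=280):
    """(1/2^(h-1)) * prod [1 + C_i(s) C_i(m)] * sum over mu | m of (-delta/mu)."""
    s = represented_unit(form, delta)
    product = 1
    for C in CHARS:
        product *= 1 + C(s) * C(m)
    F = sum(kronecker(-delta, mu) for mu in range(1, m + 1) if m % mu == 0)
    return product * F // 2 ** (len(CHARS) - 1)
```

The product is either 0 or 2^h, so the integer division is exact.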


⁶ L. E. Dickson, Introduction to the Theory of Numbers, pp. 82, 87.




REPRESENTATIONS FUNCTION FOR BINARY QUADRATIC FORMS. 591 


In order to extend this result to the number of representations of numbers 

not prime to the discriminant, there are required three auxiliary theorems. 

LEMMA 1. Let [a,b,c] be a form of discriminant −Δ. Let p be a prime
such that either p² divides Δ and p > 2 or p = 2 and Δ ≡ 0 or 12 (mod 16).
Then

$N[pm = ax^2 + bxy + cy^2, (m,p) = 1] = 0$

and

$N[p^2 m = ax^2 + bxy + cy^2] = N[m = a'x^2 + b'xy + c'y^2]$

where [a',b',c'] is a form of discriminant −Δ/p² whose characters are respectively
equal to the corresponding ones for [a,b,c].
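
Lemma 1 can be illustrated on small discriminants. The Python sketch below is an addition for the reader, with hypothetical names; the three test cases (Δ = 12 with p = 2, and the two forms of Δ = 75 with p = 5, all reducing to [1, 1, 1] of discriminant −3) are choices made here and do not appear in the paper.

```python
from math import isqrt


def count_reps(form, m):
    """Number of integer pairs (x, y) with a x^2 + b x y + c y^2 = m."""
    a, b, c = form
    delta = 4 * a * c - b * b
    x_max = isqrt(4 * c * m // delta) + 1
    y_max = isqrt(4 * a * m // delta) + 1
    return sum(
        1
        for x in range(-x_max, x_max + 1)
        for y in range(-y_max, y_max + 1)
        if a * x * x + b * x * y + c * y * y == m
    )


# (form of discriminant -delta, prime p, reduced form of discriminant -delta/p^2)
CASES = [
    ((1, 0, 3), 2, (1, 1, 1)),     # delta = 12 = 12 (mod 16), p = 2
    ((1, 1, 19), 5, (1, 1, 1)),    # delta = 75 = 3 * 5^2, p = 5
    ((3, 3, 7), 5, (1, 1, 1)),     # the second class of delta = 75
]


def lemma1_holds(form, p, reduced, limit=40):
    """Check both statements of Lemma 1 for m = 1, ..., limit."""
    for m in range(1, limit + 1):
        if m % p and count_reps(form, p * m) != 0:
            return False
        if count_reps(form, p * p * m) != count_reps(reduced, m):
            return False
    return True
```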


LEMMA 2. Let [a,b,c] be a form of discriminant −Δ, and let the prime,
p, divide Δ but not satisfy the conditions of Lemma 1. Then

$N[pm = ax^2 + bxy + cy^2] = N[m = a'x^2 + b'xy + c'y^2]$

where [a',b',c'] is a form of discriminant −Δ whose characters are equal
to the product of the corresponding characters for [a,b,c] and those for the
form of discriminant −Δ representing p.
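
For the discriminant −280 the partner form [a', b', c'] of Lemma 2 can be found by multiplying character values. The Python sketch below is an added illustration with hypothetical names; it reads each form's characters ε, (n/5), (n/7) off the smallest represented integer prime to 70.

```python
from math import gcd, isqrt

FORMS = [(1, 0, 70), (2, 0, 35), (5, 0, 14), (7, 0, 10)]


def count_reps(form, m):
    """Number of integer pairs (x, y) with a x^2 + b x y + c y^2 = m."""
    a, b, c = form
    delta = 4 * a * c - b * b
    x_max = isqrt(4 * c * m // delta) + 1
    y_max = isqrt(4 * a * m // delta) + 1
    return sum(
        1
        for x in range(-x_max, x_max + 1)
        for y in range(-y_max, y_max + 1)
        if a * x * x + b * x * y + c * y * y == m
    )


def triple(n):
    """Values of epsilon, (n/5), (n/7) at an integer n prime to 70."""
    eps = 1 if n % 8 in (1, 7) else -1
    leg5 = 1 if pow(n, 2, 5) == 1 else -1
    leg7 = 1 if pow(n, 3, 7) == 1 else -1
    return (eps, leg5, leg7)


def form_characters(form):
    """Character triple of a form, from a represented n prime to 70."""
    n = 1
    while gcd(n, 70) != 1 or count_reps(form, n) == 0:
        n += 1
    return triple(n)


def lemma2_partner(form, p):
    """Form whose characters are those of `form` times those of the form representing p."""
    by_chars = {form_characters(f): f for f in FORMS}
    rep_p = next(f for f in FORMS if count_reps(f, p))
    target = tuple(x * y for x, y in zip(form_characters(form),
                                         form_characters(rep_p)))
    return by_chars[target]
```

For instance, lemma2_partner((2, 0, 35), 5) gives (7, 0, 10), in agreement with the substitution x = 5x' in 2x² + 35y² = 5m.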


Lemmas 1 and 2 are taken directly from theorems stated by G. Pall⁷
with our added condition that there be a single class to a genus.

If [a, b, c] has a discriminant −Δ ≡ −3 (mod 8), then since
Δ = 4ac − b², a, b, and c must be odd, so that if $ax^2 + bxy + cy^2 \equiv 0 \pmod 2$,
then $x^2 + xy + y^2 \equiv 0 \pmod 2$. If x were odd, y could be neither odd nor
even, thus x and y must both be even, x = 2ξ, y = 2η, and

$ax^2 + bxy + cy^2 \equiv 0 \pmod 4$.

This proves 

LEMMA 3. Let [a,b,c] be a form of discriminant −Δ ≡ −3 (mod 8).
Then

$N[2^r m = ax^2 + bxy + cy^2, m \text{ odd}] = N[m = ax^2 + bxy + cy^2]$, r even
= 0, r odd.
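
Lemma 3 is easy to test by enumeration. The Python sketch below is an added illustration with hypothetical names; the forms tried (Δ = 3, 11, 19, 75, all congruent to 3 mod 8) and the counter-example with Δ = 7 are choices made here.

```python
from math import isqrt


def count_reps(form, m):
    """Number of integer pairs (x, y) with a x^2 + b x y + c y^2 = m."""
    a, b, c = form
    delta = 4 * a * c - b * b
    x_max = isqrt(4 * c * m // delta) + 1
    y_max = isqrt(4 * a * m // delta) + 1
    return sum(
        1
        for x in range(-x_max, x_max + 1)
        for y in range(-y_max, y_max + 1)
        if a * x * x + b * x * y + c * y * y == m
    )


def lemma3_holds(form, max_r=5, max_m=25):
    """Check N[2^r m] = N[m] for even r and = 0 for odd r, over odd m."""
    for m in range(1, max_m + 1, 2):
        base = count_reps(form, m)
        for r in range(max_r + 1):
            expected = base if r % 2 == 0 else 0
            if count_reps(form, 2 ** r * m) != expected:
                return False
    return True
```

The form [1, 1, 2] of discriminant −7 shows that the hypothesis Δ ≡ 3 (mod 8) cannot be dropped, since it represents 2.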


The number of representations function. In the theorems below giving
the explicit form of the number of representations function for all cases where
there is a single class of forms to a genus, $C_i(s)$ is to stand for the value of


⁷ G. Pall, Mathematische Zeitschrift, loc. cit., pp. 331-332, Theorems 4 and 5.




the characters for the positive, primitive form [a,b,c] of discriminant −Δ,
and we write $F(m) = \sum_{\mu \mid m} (-\Delta/\mu)$, where the summation is taken over all
divisors μ of m.


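THEOREM 4. If $\Delta = p_1 p_2 \cdots p_k \equiv 3 \pmod 8$, Δ > 3, where the $p_i$
are distinct odd primes,

$N[2^{2\lambda} p_1^{\alpha_1} \cdots p_k^{\alpha_k} m = ax^2 + bxy + cy^2, (m, 2\Delta) = 1] = \frac{1}{2^{h-1}} \prod_{i=1}^{h} \{1 + \prod_{j=1}^{k} [C_i(p_j)]^{\alpha_j} C_i(s)C_i(m)\} F(m)$.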


The powers of the odd primes, $p_j$, in the number represented are reduced
according to Lemma 2. Whence by Theorem 3 the statement follows. The
even power 2λ is required by Lemma 3. The factor multiplying F(m) occurs
in this manner merely to associate the plus or minus one value of the characters
with the representation or non-representation according to Lemma 2.

It has been shown previously⁸ that if Δ ≡ 7 (mod 8), the only discriminants
for which there is a single class to a genus are Δ = 7 and Δ = 15.


These are included in 


THEOREM 5. If $\Delta = p_1 p_2$, $p_1$ = 3 or 7, $p_2$ = 5 or 1, respectively,


N[2p,.%p.%m = + bry + (m, 2A) = 1] 


IT {1 + [Ci(pi) [Ci (pz) }F(m). 


The powers of the odd primes, $p_j$, in the number represented are reduced
according to Lemma 2, while the power of two is reduced by use of results
stated by Dickson⁹ on forms of discriminant −7 and −15. The theorem
follows from Theorem 3.
The only odd discriminants containing the square of a prime as a factor
which have a single class to a genus are:

Δ = 27, 75, 99, 147, 315.¹⁰


These are included in 


THEOREM 6. If $\Delta = p_1^2 p_2 p_3$, where $p_1$ = 3, 5, 3, 7, 3; $p_2 p_3$ = 3, 3, 11, 3, 35
respectively,


⁸ N. A. Hall, Mathematische Zeitschrift, vol. 44 (1938), p. 88.
⁹ Dickson, loc. cit., pp. 81, 88.
¹⁰ Hall, loc. cit.




= ar? + + cy”, (m, 24) = 1] 
= {1 +1 [Ci (pi) ]%Ci(s)Ci(m)}F(m), a =0 
=(, =—1 
I (1 + II [Ci (pj) Ci (pie ?m) JF (pit ?m), = 2 


where » = 3 when pop, and otherwise, and is the character 
/ 1 


associated with py. 

The power of $p_1$ in the number represented is reduced according to
Lemma 1, while those of $p_2$ and $p_3$ are reduced according to Lemma 2. The
statement then follows from Theorem 3. The even power 2λ is required by
Lemma 3.

THEOREM 7%. Jf or 8 (mod 16), AA-+4, where 


pr = 2, 9== 2 or 3 and the remaining pj; are distinct odd primes, 
N[ prt po% = ax? + bry cy’, (m, A) = 1] 
ITC (pj) 


1 

The powers of the primes, $p_j$, in the number represented are reduced
according to Lemma 2. Whence by Theorem 3 the statement follows.

When Δ ≡ 0 (mod 16), there is more than a single class to a genus unless
Δ = 16n or 64n, n = 1, 3, 7, 15, or unless Δ ≡ 32 (mod 64). The latter
case is included in

THEOREM & Jf A= pe, where p,=2 and the remaining pi 
are distinct odd primes, 

N[pimp.% - = ar? + bry +- cy*, (m, A) = 1] 

a 1 


== (), 


| 


where $C_h$ is the character for forms of discriminant −Δ not a character for
forms of discriminant −Δ/4.
The power of $p_1$ = 2 in the number represented is reduced according to
Lemma 1 and those of the odd primes, $p_j$, according to Lemma 2. The statement
is then a consequence of Theorem 3.

When Δ ≡ 12 (mod 16) the only discriminants for which there is a single


¹¹ Hall, loc. cit.




class of forms to a genus are Δ = 12, 28, 60.¹¹ These together with Δ = 16n
and 64n, n = 3, 7, 15, and Δ = 3 are included in

THEOREM 9. If $\Delta = 2^{\theta} p_1 p_2$, where θ = 0, 2, 4, 6, and $p_1 p_2$ = 3, 7, 15,


= ar? +- bey + cy?, (m, 2A) = 1] 


IT (pi) AZO 
h h 
1 


where 
=3, A even 
o=0, pip.=3, A odd 


and are in this case the characters for A= and Chr, 
the additional characters for A= 16p,po and GAp,ps respectively. 
The power of 2 in the number represented is reduced according to Lemma
1, and those of $p_1$ and $p_2$ according to Lemma 2. These reductions together
with the appropriate uses of Theorems 1, 3, and 5 provide the statement given.
The even discriminants Δ = 4, 16, 64 are included in


THEOREM 10. If Δ = 4,
$N[2^{\alpha} m = ax^2 + bxy + cy^2, (m,2) = 1] = 4F(m)$.
If Δ = 16,
$N[2^{\alpha} m = ax^2 + bxy + cy^2, (m,2) = 1] = wF(m)$,
w = 2, α = 0; w = 0, α = 1; w = 4, α ≥ 2.
If A= 64, 


N[2¢m = aa* + bry + cy’, (m,2) = 1] 
= 4/1 + 8(s)8(m) ] [1+ «(s)e(m) (m), a= 0 
= (m) 
o=0, ¢=1,3; —4, a= 4. 
The power of 2 in the number represented is reduced according to Lemma 1.
Theorem 3 completes the statement.
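
The two simplest cases, Δ = 4 with the form [1, 0, 1] and Δ = 16 with the form [1, 0, 4], can be checked by enumeration. The Python sketch below is an added illustration with hypothetical names; it assumes the values 4F(m) for Δ = 4 and wF(m) with w = 2, 0, 4 for α = 0, α = 1, α ≥ 2 when Δ = 16, and uses the fact that (−4/μ) = (−16/μ) = (−1)^((μ−1)/2) for odd μ.

```python
from math import isqrt


def count_reps(form, m):
    """Number of integer pairs (x, y) with a x^2 + b x y + c y^2 = m."""
    a, b, c = form
    delta = 4 * a * c - b * b
    x_max = isqrt(4 * c * m // delta) + 1
    y_max = isqrt(4 * a * m // delta) + 1
    return sum(
        1
        for x in range(-x_max, x_max + 1)
        for y in range(-y_max, y_max + 1)
        if a * x * x + b * x * y + c * y * y == m
    )


def F(m):
    """Divisor sum of (-4/mu) over the divisors mu of an odd integer m."""
    return sum((-1) ** ((mu - 1) // 2)
               for mu in range(1, m + 1, 2) if m % mu == 0)


def predicted(delta, alpha, m):
    """Assumed value of N[2^alpha m] for delta = 4 or 16 and odd m."""
    if delta == 4:
        return 4 * F(m)
    w = 2 if alpha == 0 else 0 if alpha == 1 else 4
    return w * F(m)


FORMS = {4: (1, 0, 1), 16: (1, 0, 4)}
```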


12 Hall, loc. cit. 


~ 2h | _| 
j=l 
1 h+? h 
+ [Ci Ci(s)Ci(m) P(m), A= 6—6 
i=1 j=1 
=0, A<O, dA odd, 
o=A—6+ Pips = or 15 




The only even discriminants containing as a factor the square of an odd
prime which have a single class to a genus are:

Δ = 36, 72, 100, 180, 288.¹³

The number of representations function for these even discriminants is
given by
THEOREM 11. If A = 2937, 0 = 3 or 5, 
== ar? + bry (m, A) = 1] 
AL [1+ |F(m), B=O 
at [1 + Ci(2)8Ci(s) Ci (38m) B= 2 


B=1. or 


where Cy = (n/3). If A= 2*p*, p=83 or d, 


N[2¢pPm + bry + cy?, (m, A) = 1] 


paw. 


== ax? 4- bry +- cy*, (m, A) = 1] 
[1 Ci(2) (s)Ci(m) |F(m), B=0 


[1 + (5) (s) JF (38m), B= 2 


Qh-2 
B 
where (n/3). 


The reductions are again made according to Lemma 1 and 2, and the 
statements follow from Theorem 3. 
The numerical calculations for the number of representations will be 


aided by 


THEOREM 12. When there is a single class of forms to a genus and Δ is
not divisible by the square of an odd prime, 2⁴, or 2⁶,

$F(m) = \sum_{\mu \mid m} (-\Delta/\mu) = \sum_{\mu \mid m} \prod_{i=1}^{h} C_i(\mu)$.


¹³ Hall, loc. cit.
If == 27-32-35, 




When p² | Δ, p an odd prime,

$F(m) = \sum_{\mu \mid m} \prod_{i=1}^{h-1} C_i(\mu)$

where $C_h = (n/p)$.


This theorem is directly evident from the definition of the characters 
given in Theorem 2 and from the law of quadratic reciprocity. 
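
The identity behind this remark can be spot-checked. The Python sketch below is an added illustration with hypothetical names; it compares Kronecker's symbol with the product of the characters for Δ = 280 (characters ε, (n/5), (n/7)) and, as an assumed instance of the case p² | Δ, for Δ = 75 with the character (n/5) left out.

```python
from math import gcd


def kronecker(a, n):
    """Kronecker symbol (a/n) for an integer a and a positive integer n."""
    sign = 1
    while n % 2 == 0:
        if a % 2 == 0:
            return 0
        if a % 8 in (3, 5):
            sign = -sign
        n //= 2
    a %= n
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                sign = -sign
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            sign = -sign
        a %= n
    return sign if n == 1 else 0


def product_280(mu):
    """epsilon(mu) * (mu/5) * (mu/7) for mu prime to 70."""
    return kronecker(2, mu) * kronecker(mu, 5) * kronecker(mu, 7)


def agrees_280(limit):
    """Compare (-280/mu) with the character product for mu prime to 70."""
    return all(kronecker(-280, mu) == product_280(mu)
               for mu in range(1, limit) if gcd(mu, 70) == 1)


def agrees_75(limit):
    """Compare (-75/mu) with (mu/3) alone for mu prime to 2 * 75."""
    return all(kronecker(-75, mu) == kronecker(mu, 3)
               for mu in range(1, limit) if gcd(mu, 150) == 1)
```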


Numerical computations. As indicated above Theorems 4 through 11 
give the explicit form of the number of representations function for all forms 
of discriminant with a single class to a genus. In specific cases the application 
of these results will require the knowledge of the characters for the form and 
the values these characters assume for forms representing various numbers 
and for different genera of the same discriminant. This information can be
readily calculated from the definitions given in Theorem 2. The author has
prepared a table giving this data for all known discriminants having a single
class to a genus.$^{14}$ This table lists all the reduced forms of given discriminant
together with the several characters and the values they assume for the num- 
bers represented by each of the reduced forms. 

As an illustration of the method consider the form $2x^2 + 35y^2$, of
discriminant $-280 = -2^3 \cdot 5 \cdot 7$. The description of the characters as
calculated or read from the table referred to above can be presented compactly:


$-280 = -2^3 \cdot 5 \cdot 7$     $\epsilon(n)$    $(n/5)$    $(n/7)$
[1, 0, 70]                     1        1        1
[2, 0, 35],  2                -1       -1        1
[5, 0, 14],  5                -1        1       -1
[7, 0, 10],  7                 1       -1       -1


The reduced forms of discriminant $-280$ are: [1, 0, 70], [2, 0, 35], [5, 0, 14],
[7, 0, 10]. The prime factors of the discriminant, 2, 5, and 7, are represented
by the last three of these respectively. There are three characters, $\epsilon(n)$, $(n/5)$,
$(n/7)$, which take on the values listed for numbers represented by the form
opposite. The number of representations function is given for this case by
Theorem 7. The function is accordingly:

N[2%5%7%m 2x? + 35y", (m, 70) = 1] 

u/m 

We have, furthermore, 


14N. A. Hall, California Institute of Technology, Thesis (1938), pp. 104-116. 




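$$\epsilon(m) = \begin{cases} 1, & m \equiv 1 \text{ or } 7 \pmod 8 \\ -1, & m \equiv 3 \text{ or } 5 \pmod 8 \end{cases}$$
$$(m/5) = \begin{cases} 1, & m \equiv 1 \text{ or } 4 \pmod 5 \\ -1, & m \equiv 2 \text{ or } 3 \pmod 5 \end{cases}$$
$$(m/7) = \begin{cases} 1, & m \equiv 1, 2, \text{ or } 4 \pmod 7 \\ -1, & m \equiv 3, 5, \text{ or } 6 \pmod 7. \end{cases}$$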
Hence we may separate integers m, (m, 70) = 1, into residue classes, modulo 280,
with the triplet $\epsilon(m)$, $(m/5)$, $(m/7)$ identical in value for all integers in
the class:

      $\epsilon(m)$  $(m/5)$  $(m/7)$
 1)     +      +      +     1, 9, 39, 71, 79, 81, 121, 151,
                            169, 191, 239, 249.
 2)     -      +      +     11, 29, 51, 99, 109, 141, 149,
                            179, 211, 219, 261, 221.
 3)     +      -      +     23, 57, 113, 127, 177, 183, 193,
                            207, 233, 247, 263, 137.
 4)     -      -      +     37, 53, 67, 93, 107, 123, 163,
                            197, 253, 267, 277, 43.
 5)     +      +      -     31, 41, 111, 89, 129, 159, 199,
                            201, 209, 241, 271, 279.
 6)     -      +      -     19, 59, 61, 69, 101, 131, 139,
                            171, 181, 229, 251, 269.
 7)     +      -      -     17, 33, 47, 73, 87, 97, 103, 153,
                            223, 257, 143, 167.
 8)     -      -      -     3, 13, 27, 83, 117, 157, 173,
                            187, 213, 227, 237, 243.


According to the parity of $\alpha$, $\beta$, $\gamma$, we have the four cases:

        $\alpha$      $\beta$      $\gamma$
 a)   even   even   even
      odd    odd    odd
 b)   odd    even   even
      even   odd    odd
 c)   even   odd    even
      odd    even   odd
 d)   odd    odd    even
      even   even   odd




Applying these results to the number of representations function as given
above, it is possible to state further:

$$N[2^{\alpha} 5^{\beta} 7^{\gamma} m = 2x^2 + 35y^2,\ (m, 70) = 1] = 2 \sum_{\mu \mid m} (-280/\mu)$$

when $\alpha$, $\beta$, $\gamma$, and m are paired according to the divisions above: a), 4);
b), 1); c), 7); d), 6); otherwise the number of representations is zero.
According to Theorem 12,

$$F(m) = \sum_{\mu \mid m} \epsilon(\mu)(\mu/5)(\mu/7) = \sum_{\mu \mid m} (-280/\mu).$$

Hence F(m) is equal to the excess of the divisors of m in classes 1), 4), 6), 7)
over those in classes 2), 3), 5), 8).
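
The division into eight residue classes and the function F(m) can be checked mechanically. The following is only a sketch: the function names (`legendre`, `epsilon`, `triplet`, `residue_classes`, `F`) are illustrative choices and are not taken from the paper, and F(m) is computed by naive trial division over the divisors of m.

```python
from math import gcd

def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, by Euler's criterion."""
    r = pow(n % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def epsilon(m):
    """epsilon(m) = +1 for m = 1, 7 (mod 8) and -1 for m = 3, 5 (mod 8)."""
    return 1 if m % 8 in (1, 7) else -1

def triplet(m):
    """The character triplet (epsilon(m), (m/5), (m/7)) for (m, 70) = 1."""
    return (epsilon(m), legendre(m, 5), legendre(m, 7))

def residue_classes():
    """Group the residues modulo 280 that are prime to 70 by their triplet."""
    classes = {}
    for r in range(1, 280):
        if gcd(r, 70) == 1:
            classes.setdefault(triplet(r), []).append(r)
    return classes

def F(m):
    """Sum of epsilon(mu)(mu/5)(mu/7) over the divisors mu of m, (m, 70) = 1."""
    total = 0
    for mu in range(1, m + 1):
        if m % mu == 0:
            e, c5, c7 = triplet(mu)
            total += e * c5 * c7
    return total

classes = residue_classes()
print({signs: len(members) for signs, members in classes.items()})
print(classes[(1, -1, -1)])  # the class numbered 7) in the text
print(F(11951))
```

Each of the eight sign patterns should collect twelve residues, in agreement with the table above.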
The formula may be illustrated further by verifying the number of
representations of 23902. If

$$23902 = 2x^2 + 35y^2,$$

the empirically obtained solutions are: $x = \pm 11$, $y = \pm 26$; $x = \pm 59$,
$y = \pm 22$; $x = \pm 101$, $y = \pm 10$; $x = \pm 109$, $y = \pm 2$; with all choices of
sign permissible. Hence the number of representations is 16. To check this
with our formula, we observe that $23902 = 2 \cdot 17 \cdot 19 \cdot 37$, hence

$$N[23902 = 2x^2 + 35y^2] = N[2 \cdot 17 \cdot 19 \cdot 37 = 2x^2 + 35y^2] = 2 \sum_{\mu \mid 11951} \epsilon(\mu)(\mu/5)(\mu/7).$$

Since, referring to the previous notation, $17 \cdot 19 \cdot 37 = 11951 \equiv 191 \pmod{280}$,
we have case b), 1). Furthermore, for each of the prime factors of 11951,
$\epsilon(\mu)(\mu/5)(\mu/7)$ is seen by reference to 4), 6), and 7) to be +1. Hence the
same is true for all factors. In all, 11951 has eight factors: 1; 17; 19; 37;
17·19; 17·37; 19·37; 17·19·37. Thus, finally

$$N[23902 = 2x^2 + 35y^2] = 2 \cdot 8 = 16,$$

to agree with the empirical result.
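
The empirical count and the value given by the formula can both be reproduced by direct enumeration. This is a minimal sketch under stated assumptions: the helper names (`count_representations`, `kronecker_280`, `formula_count`) are hypothetical, and the symbol (-280/mu) is evaluated through the residue conditions for epsilon(m), (m/5), (m/7) quoted earlier, so it is valid only for mu prime to 70.

```python
from math import isqrt

def count_representations(n, a=2, c=35):
    """Count integer pairs (x, y), signs included, with a*x^2 + c*y^2 == n."""
    count = 0
    y_max = isqrt(n // c)
    for y in range(-y_max, y_max + 1):
        rest = n - c * y * y
        if rest % a != 0:
            continue
        x = isqrt(rest // a)
        if x * x == rest // a:
            count += 1 if x == 0 else 2  # both x and -x
    return count

def kronecker_280(mu):
    """(-280/mu) for mu prime to 70, taken as epsilon(mu)(mu/5)(mu/7)."""
    eps = 1 if mu % 8 in (1, 7) else -1
    c5 = 1 if mu % 5 in (1, 4) else -1
    c7 = 1 if mu % 7 in (1, 2, 4) else -1
    return eps * c5 * c7

def formula_count(m):
    """Twice the sum of (-280/mu) over the divisors mu of m."""
    return 2 * sum(kronecker_280(mu) for mu in range(1, m + 1) if m % mu == 0)

print(count_representations(23902))
print(formula_count(11951))
```

The two printed numbers should coincide, which is the agreement asserted in the text for 23902 = 2·11951.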


QUEENS COLLEGE, 
FLUSHING, NEW YORK. 




PARTITION HYPERGROUPS.* $^{1}$


By HOWARD CAMPAIGNE.


1. Introduction. In 1934 F. Marty$^{2}$ and H. S. Wall$^{3}$ introduced
independently the notion of hypergroup. They both used this term for a system
which may in particular be a group, but in which the product of two elements
is in general a set of elements of the system. Both writers discussed partition
hypergroups in which the elements are sets of elements of a group. The
question arises as to whether or not every hypergroup can be represented in
this way as a partition hypergroup obtained from a group. Marty offered
the conjecture that the answer is in the affirmative. Wall gave an example
of a hypergroup which cannot be so represented by means of the special
conjugation which he considered. It is shown in the present paper (section 6)
that it is not possible to represent this hypergroup by means of any conjugation
whatsoever among the elements of a group.

The second main result obtained is a characterization of simple groups
in terms of a partition hypergroup (section 9). It is shown that a group G
is simple if and only if a certain partition hypergroup contains no proper
sub-hypergroups except the identity group. This partition hypergroup is
obtained by means of a conjugation among the elements of G depending on
its group of inner automorphisms, and the proof depends on the study of the
lattices of this hypergroup.

Sections 3, 4, 5 treat of partition hypergroups, the mapping of one hyper- 
group upon another, semi-regular, regular, and commutative hypergroups, 
respectively. In section 7 there are considered examples of conjugations. 

An analogue of the direct product of groups is the subject of section 8. 
Hypergroups which are products of two hypergroups are completely char- 
acterized. Many of the ideas of this section are generalizations of ideas in 
R. Remak’s papers (there cited). 


* Received August 4, 1938; Revised February 21, 1940.

1 Presented to the Society April 8, 1938.

2 F. Marty, "Sur une généralisation de la notion de groupe," Särtryck ur
Förhandlingar vid Åttonde Skandinaviska Matematikerkongressen i Stockholm (1934),
pp. 45-49.

3 H. S. Wall, "Hypergroups," Bulletin of the American Mathematical Society,
vol. 41 (1935), p. 36. [Presented at the annual meeting of the American Mathematical
Society, Pittsburgh, December 27-31, 1934.]




2. Definition of a hypergroup. We consider a system H of elements
a, b, c, ... in which a product ab is defined for every pair of elements a, b of
H. The product CD of two subsets C, D of H is defined as the set of all
distinct elements of the products cd as c ranges over C and d over D. The
system H is a hypergroup if it satisfies the following postulates.$^{4}$


I. If a and b are elements of H, then the product ab is a non-vacuous
subset of distinct elements of H.

II. If a, b, c are elements of H then a(bc) = (ab)c.

III. There is in H at least one element e, called an identity, such that
for every element b in H the products eb and be contain b.

IV. There is at least one identity e in H such that if b is an arbitrary
element of H there is in H at least one element $b^{-1}$, called an inverse of b
relative to e, such that the sets $b^{-1}b$ and $bb^{-1}$ contain e.


It is easily shown that if$^{5}$ $a \in H$, $b \in H$ then there exist elements x, y in H
such that $b \in ax$, $b \in ya$. From the definition of a group it follows that if the
products ab are all single element sets, then H is a group.

A subhypergroup K of H is a subset of H in which the postulates I to IV
are satisfied with the law of multiplication of H. If $K \neq H$ and contains at
least one identity of H, K is a proper subhypergroup of H.
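
For a finite system the four postulates can be tested directly from a multiplication table. The sketch below is illustrative only: the table is assumed to be a Python dict mapping ordered pairs to sets, and the names `set_product`, `identities`, `is_hypergroup`, `is_group`, as well as the two sample systems, are assumptions made for the example rather than anything defined in the paper.

```python
from itertools import product

def set_product(table, C, D):
    """The product CD: all elements of the products cd, c in C, d in D."""
    out = set()
    for c in C:
        for d in D:
            out |= table[(c, d)]
    return out

def identities(elements, table):
    """Elements e such that eb and be contain b for every b (postulate III)."""
    return [e for e in elements
            if all(b in table[(e, b)] and b in table[(b, e)] for b in elements)]

def is_hypergroup(elements, table):
    universe = set(elements)
    # Postulate I: every product is a non-vacuous subset of the system.
    if any(not table[(a, b)] or not table[(a, b)] <= universe
           for a, b in product(elements, repeat=2)):
        return False
    # Postulate II: a(bc) = (ab)c as sets.
    for a, b, c in product(elements, repeat=3):
        if set_product(table, {a}, table[(b, c)]) != set_product(table, table[(a, b)], {c}):
            return False
    # Postulates III and IV: some identity e relative to which
    # every element b has an inverse x with e in xb and e in bx.
    for e in identities(elements, table):
        if all(any(e in table[(x, b)] and e in table[(b, x)] for x in elements)
               for b in elements):
            return True
    return False

def is_group(elements, table):
    """A hypergroup whose products are all single element sets."""
    return is_hypergroup(elements, table) and all(len(p) == 1 for p in table.values())

# Sample 1: the cyclic group of order 3, written as single element products.
z3_elements = [0, 1, 2]
z3 = {(a, b): {(a + b) % 3} for a in z3_elements for b in z3_elements}

# Sample 2: two elements, every product equal to the whole system.
t2_elements = [0, 1]
t2 = {(a, b): {0, 1} for a in t2_elements for b in t2_elements}

print(is_hypergroup(z3_elements, z3), is_group(z3_elements, z3))
print(is_hypergroup(t2_elements, t2), is_group(t2_elements, t2))
```

The second sample is a hypergroup that is not a group, since its products are not single element sets.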


3. Definition of conjugation. An equivalence relation, ~, in a
hypergroup H is called a conjugation if when $a \in H$, $b \in H$, $c \in ab$, $c' \sim c$, then there
exist elements a', b' in H such that $a' \sim a$, $b' \sim b$, $c' \in a'b'$. If $a' \sim a$ we shall
say that a' is conjugate to a. This relation is symmetric, reflexive, and
transitive.

If $\gamma$ is a conjugation in H, the distinct residue classes $\{a\}_\gamma$ of elements
conjugate to a form a hypergroup $\{H\}_\gamma$ with respect to the law of
multiplication which requires that

$$\{c\}_\gamma \in \{a\}_\gamma \{b\}_\gamma$$

if and only if there exist elements a', b' in H conjugate to a and b, respectively,
such that $c \in a'b'$. It will be seen that postulates I to IV are satisfied by this
system. If e is an identity of H then $\{e\}_\gamma$ is obviously an identity of $\{H\}_\gamma$;

4 This definition is somewhat different both from that of Wall and from that of
Marty. Wall's definition ("Hypergroups," American Journal of Mathematics, vol. 59
(1937), pp. 77-98) differs only in one respect, namely, that he requires the product
ab to have exactly n elements (not necessarily distinct) where n is a fixed integer
greater than 0. The definition we have adopted agrees with that of the regular
multigroup of Dresher and Ore, "Theory of multigroups," American Journal of Mathematics,
vol. 60 (1938), pp. 705-733.

5 The symbol $a \in A$ is read "a is an element of A."




if $b^{-1}$ is an inverse of b relative to e then $\{b^{-1}\}_\gamma$ is an inverse of $\{b\}_\gamma$ relative
to $\{e\}_\gamma$. We shall call $\{H\}_\gamma$ the partition hypergroup of H relative to the
conjugation $\gamma$.

By the remark near the end of section 2, $\{H\}_\gamma$ is a group if and only if
when a, b, a', b' are elements of H such that $a \sim a'$, $b \sim b'$, $c \in ab$, $d \in a'b'$, then
$c \sim d$.

We shall say that a subset J of a hypergroup H is appropriate relative
to a conjugation $\gamma$ in H if J contains with a all the conjugates of a relative
to $\gamma$. If J is an appropriate subhypergroup of H then $\gamma$ induces a conjugation
$\gamma_1$ in J, and it is easily seen that the partition hypergroup $\{J\}_{\gamma_1}$ is a
subhypergroup of $\{H\}_\gamma$. Conversely, if K is a subhypergroup of $\{H\}_\gamma$ then the
set J of all the elements of H contained in the residue classes of K is
appropriate relative to $\gamma$.


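4. Mapping of one hypergroup upon another. We shall consider a
mapping of a hypergroup A upon a hypergroup $\mathfrak{A}$ such that the following
conditions are satisfied.


(1) Each element a of A is mapped upon a uniquely determined element
$\mathfrak{a}$ of $\mathfrak{A}$, in symbols $a \to \mathfrak{a}$.

(2) If $\mathfrak{a} \in \mathfrak{A}$ then there is at least one element a of A such that $a \to \mathfrak{a}$.

(3) If $c \in ab$ and $c \to \mathfrak{c}$, $a \to \mathfrak{a}$, $b \to \mathfrak{b}$, then $\mathfrak{c} \in \mathfrak{a}\mathfrak{b}$.

(4) If $\mathfrak{a} \in \mathfrak{A}$, $\mathfrak{b} \in \mathfrak{A}$ and $\mathfrak{c} \in \mathfrak{a}\mathfrak{b}$ then there exist elements a, b, c in A such that
$a \to \mathfrak{a}$, $b \to \mathfrak{b}$, $c \to \mathfrak{c}$, and $c \in ab$.


It follows from (2), (3) that an identity of A is mapped upon an identity
of $\mathfrak{A}$. If $a^{-1}$ is an inverse of a relative to an identity e, and $a^{-1} \to \mathfrak{a}_1$, $a \to \mathfrak{a}$,
$e \to \mathfrak{e}$, then it follows from (2), (3) that $\mathfrak{a}_1 = \mathfrak{a}^{-1}$ is an inverse of $\mathfrak{a}$ relative
to $\mathfrak{e}$.

If there exists a mapping of A upon $\mathfrak{A}$ satisfying conditions (1) to (4)
we shall say that A is semi-isomorphic with $\mathfrak{A}$, in symbols $A \simeq \mathfrak{A}$. In
particular A is isomorphic with $\mathfrak{A}$, $A \cong \mathfrak{A}$, if the mapping satisfies (1), (3), and
(4), and the following condition stronger than (2):

(2') If $\mathfrak{a} \in \mathfrak{A}$ then there is exactly one element a of A such that $a \to \mathfrak{a}$.

Isomorphism is reflexive, symmetric, and transitive. Semi-isomorphism
is reflexive but not symmetric. It is easily seen to be transitive. In fact, if
$P \simeq Q$, $Q \simeq R$, $p \to q$, $q \to r$, then it will be seen that the mapping $p \to r$
maps P upon R in such a way that the conditions (1) to (4) are satisfied,
and therefore $P \simeq R$.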


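THEOREM 4.1. If A and $\mathfrak{A}$ are finite hypergroups, and $A \simeq \mathfrak{A}$, $\mathfrak{A} \simeq A$,
then $A \cong \mathfrak{A}$.


Proof. Let A, $\mathfrak{A}$ be of orders $\mu$, $\nu$ respectively. Since $A \simeq \mathfrak{A}$ it follows
that $\mu \geq \nu$, and since $\mathfrak{A} \simeq A$, $\nu \geq \mu$. Hence $\mu = \nu$, and therefore $A \cong \mathfrak{A}$ since
the mapping must be one to one.

The following theorem may readily be verified.


THEOREM 4.2. Let $\{H\}_\gamma$ be a partition hypergroup of H relative to the
conjugation $\gamma$. If $a \in H$, then the mapping $a \to \{a\}_\gamma$ is a semi-isomorphism,
so that $H \simeq \{H\}_\gamma$.


THEOREM 4.3. Let $\{H\}_{\gamma_1}$, $\{H\}_{\gamma_2}$ be partition hypergroups of H relative
to conjugations $\gamma_1$, $\gamma_2$ such that if $a \sim b$ relative to $\gamma_1$ then $a \sim b$ relative to
$\gamma_2$. Then there exists a conjugation $\gamma_3$ in $\{H\}_{\gamma_1}$ such that $\{\{H\}_{\gamma_1}\}_{\gamma_3} \cong \{H\}_{\gamma_2}$.

Proof. Let $\{a\}_{\gamma_1} \sim \{b\}_{\gamma_1}$ when $a \sim b$ relative to $\gamma_2$. This defines a
conjugation $\gamma_3$ in $\{H\}_{\gamma_1}$. The mapping $\{\{a\}_{\gamma_1}\}_{\gamma_3} \to \{a\}_{\gamma_2}$ of $\{\{H\}_{\gamma_1}\}_{\gamma_3}$ upon
$\{H\}_{\gamma_2}$ is seen to be an isomorphism.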

An automorphism of H is an isomorphic mapping of H upon itself. The
set of automorphisms of H forms a group. There is one case in which a
subgroup P of this group induces a group of automorphisms in a partition
hypergroup $\{H\}_\gamma$, namely, when the conjugation is preserved under the
automorphisms of P. If an automorphism p of P maps a upon a' we shall write
$a \xrightarrow{p} a'$. Suppose then that whenever $p \in P$, $a \xrightarrow{p} a'$, $b \xrightarrow{p} b'$, $a \sim b$ it follows
that $a' \sim b'$. Define a mapping p' of $\{H\}_\gamma$ upon itself by letting $\{a\}_\gamma \xrightarrow{p'} \{a'\}_\gamma$
when $a \xrightarrow{p} a'$. This is clearly a one to one mapping of $\{H\}_\gamma$ upon itself, and
is easily seen to be an automorphism of $\{H\}_\gamma$. The set of these induced
automorphisms forms a group Q. It may be shown that $P \cong Q$ if when $p \in P$, $a \in H$,
a+>b then a+b. 


5. Semi-regular, regular, and commutative hypergroups. A hypergroup
A will be called semi-regular if it contains at least one element s, called
a scalar, such that if $a \in A$ then as and sa are single element sets. The set of
all scalars is called the nucleus of A. Wall$^{6}$ has shown that for his
hypergroup the nucleus forms a subgroup of the hypergroup, and its identity is
the only identity of the hypergroup. The same holds for the hypergroup
here considered, and the proof given by Wall holds without modification.


THEOREM 5.1. Let A, $\mathfrak{A}$ be two hypergroups such that there is a
semi-isomorphic mapping $a \to \mathfrak{a}$ of A upon $\mathfrak{A}$. Let E, B denote the subsets of
elements of A mapped upon an identity $\mathfrak{e}$ and an arbitrary element $\mathfrak{b}$,
respectively, of $\mathfrak{A}$. Then $\mathfrak{A}$ is semi-regular if and only if


(1) E is a subhypergroup of A, and
(2) $EB \subset B$, $BE \subset B$ for every $\mathfrak{b}$.


6 Wall, loc. cit., in footnote 4, Theorem 4, p. 79.


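Proof. Supposing that $\mathfrak{A}$ is semi-regular, we shall prove that (1) and
(2) hold. Let a, b be elements of E, and $c \in ab$. Then $a \to \mathfrak{e}$, $b \to \mathfrak{e}$ and
therefore $c \to \mathfrak{e}^2 = \mathfrak{e}$, that is, c is in E. If $a \in E$ and $a^{-1}$ is an inverse of a relative to an
identity (necessarily in E), then if $a^{-1} \to \mathfrak{a}_1$ we have $\mathfrak{e} \in \mathfrak{a}\mathfrak{a}_1 = \mathfrak{e}\mathfrak{a}_1 = \mathfrak{a}_1$,
so that $a^{-1} \in E$. This completes the proof of (1). To prove $EB \subset B$, let $b \to \mathfrak{b}$
for every element b of B. Then if $a \in E$, $b \in B$ and $c \in ab$ we must have $\mathfrak{c} \in \mathfrak{a}\mathfrak{b} = \mathfrak{e}\mathfrak{b} = \mathfrak{b}$
so that $c \in B$. Similarly, $BE \subset B$.

Conversely let (1) and (2) hold. To prove that $\mathfrak{A}$ is semi-regular we
shall show that $\mathfrak{e}$ is a scalar. If $\mathfrak{c} \in \mathfrak{e}\mathfrak{b}$ then, since $A \simeq \mathfrak{A}$, there exist elements
c, a, b in A such that $a \in E$, $b \in B$, $c \to \mathfrak{c}$, and $c \in ab$. Thus $c \in EB \subset B$, so that $\mathfrak{c} = \mathfrak{b}$,
and therefore $\mathfrak{b} = \mathfrak{e}\mathfrak{b}$. Similarly, $\mathfrak{b} = \mathfrak{b}\mathfrak{e}$. Since this holds for every $\mathfrak{b}$ in $\mathfrak{A}$
it follows that $\mathfrak{e}$ is a scalar, as was to be proved.


COROLLARY 5.1. If $A \simeq \mathfrak{A}$ and $\mathfrak{A}$ is semi-regular, then the set N of all
elements of A which are mapped upon elements of the nucleus of $\mathfrak{A}$ is a
subhypergroup of A.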


A semi-regular hypergroup is called regular if each element has a unique
inverse with respect to the identity e and if $e \in ab$ implies $e \in ba$.


THEOREM 5.2. If H is a hypergroup such that $e \in ab$ implies that $e \in ba$
for every identity e, then a partition hypergroup $\{H\}_\gamma$ is regular if and only if:

(1) the identities of H are all contained in a single class $\{e\}_\gamma$, and the
set E of elements in this residue class is a subhypergroup of H;

(2) if B is the set of elements in any class $\{b\}_\gamma$ then $EB \subset B$ and
$BE \subset B$;

(3) the inverses relative to all identities of the elements in any class $\{a\}_\gamma$
are all in one and the same class $\{a^{-1}\}_\gamma$.


Proof. By Theorem 4.2 the mapping $a \to \{a\}_\gamma$ is a semi-isomorphism
of H upon $\{H\}_\gamma$. Hence by Theorem 5.1 the conditions (1) and (2) are
necessary for the regularity of $\{H\}_\gamma$. Condition (3) is also necessary. For
if $\{a\}_\gamma \in \{H\}_\gamma$ then when $\{H\}_\gamma$ is regular $\{a^{-1}\}_\gamma$ must be the unique inverse
of $\{a\}_\gamma$, and no other class can contain an inverse of an element in the
class $\{a\}_\gamma$.

Conditions (1) and (2) are sufficient for the semi-regularity of $\{H\}_\gamma$
by Theorem 5.1. To prove that (3) implies the regularity of $\{H\}_\gamma$ we must
show that every element $\{a\}_\gamma$ has a unique inverse. If $\{e\}_\gamma \in \{a\}_\gamma\{b\}_\gamma$ (so
that by hypothesis $\{e\}_\gamma \in \{b\}_\gamma\{a\}_\gamma$), then there exist elements a', b' conjugate
to a and b such that $e \in a'b'$. Hence b' is the inverse of a', and therefore by (3),
$\{b'\}_\gamma = \{a^{-1}\}_\gamma = \{b\}_\gamma$.

A hypergroup is commutative if ab = ba for every pair a, b of its
elements. It is easy to see that if A is commutative and $A \simeq \mathfrak{A}$, then $\mathfrak{A}$ is
commutative. In particular, a partition hypergroup of a commutative
hypergroup is commutative.

6. A regular commutative hypergroup which cannot be represented
as a partition of a group. Such a hypergroup is given by the following table:


      |  e      b      a
  ----+--------------------
   e  |  e      b      a
   b  |  b      a      e, b
   a  |  a      e, b   a, b


Wall$^{7}$ showed that this hypergroup cannot be represented as a partition
hypergroup of a group relative to the special conjugation which he considered. We
shall prove that this hypergroup has a property which no partition hypergroup
of a group has, namely: it is not inversive.

A hypergroup is inversive$^{8}$ if when $c \in ab$ there exists an identity e, and
inverses $c^{-1}$, $a^{-1}$, $b^{-1}$, of c, a, b, such that $c^{-1} \in b^{-1}a^{-1}$. Every group is necessarily
inversive. We shall prove that every partition hypergroup of a group is
inversive. More generally, we have


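THEOREM 6.1. If $A \simeq \mathfrak{A}$ and A is inversive, then $\mathfrak{A}$ is inversive.


Proof. If $\mathfrak{c} \in \mathfrak{a}\mathfrak{b}$, $\mathfrak{a} \in \mathfrak{A}$, $\mathfrak{b} \in \mathfrak{A}$, and $c \to \mathfrak{c}$, $a \to \mathfrak{a}$, $b \to \mathfrak{b}$, $c \in ab$, then by hypothesis
there exist inverses $c^{-1}$, $a^{-1}$, $b^{-1}$, relative to an identity e such that $c^{-1} \in b^{-1}a^{-1}$.
Let $a^{-1} \to \mathfrak{a}_1$, $b^{-1} \to \mathfrak{b}_1$, $c^{-1} \to \mathfrak{c}_1$. Then $\mathfrak{c}_1 \in \mathfrak{b}_1\mathfrak{a}_1$. Now $e \to \mathfrak{e}$, where $\mathfrak{e}$ is an
identity of $\mathfrak{A}$, and also $\mathfrak{e} \in \mathfrak{a}\mathfrak{a}_1$, $\mathfrak{e} \in \mathfrak{a}_1\mathfrak{a}$, that is, $\mathfrak{a}_1 = \mathfrak{a}^{-1}$; likewise
$\mathfrak{b}_1 = \mathfrak{b}^{-1}$ and $\mathfrak{c}_1 = \mathfrak{c}^{-1}$. Thus $\mathfrak{c}^{-1} \in \mathfrak{b}^{-1}\mathfrak{a}^{-1}$, so that $\mathfrak{A}$ is inversive.

In the example above, $a \in a^2$ but $a^{-1} \notin (a^{-1})^2$, so that the hypergroup is not
inversive. We therefore have:

THEOREM 6.2. There exists a regular commutative hypergroup which is
not isomorphic with a partition hypergroup of a group.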
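
The failure of the inversive property in the three element example can be confirmed by exhaustive search. The following is a sketch under assumptions: the table is entered as it is read from the printed example (with the entries of the last row fixed by commutativity), products are Python sets, and the helper names `identities`, `inverses`, `inversive_failures` are hypothetical.

```python
from itertools import product

E, A, B = "e", "a", "b"
elements = [E, A, B]

# Multiplication table of the example, as read from the text (assumed reading).
table = {
    (E, E): {E}, (E, A): {A},    (E, B): {B},
    (A, E): {A}, (A, A): {A, B}, (A, B): {E, B},
    (B, E): {B}, (B, A): {E, B}, (B, B): {A},
}

def identities(elements, table):
    """Elements e such that eb and be contain b for every b."""
    return [e for e in elements
            if all(b in table[(e, b)] and b in table[(b, e)] for b in elements)]

def inverses(x, e, elements, table):
    """Inverses of x relative to the identity e."""
    return [y for y in elements if e in table[(x, y)] and e in table[(y, x)]]

def inversive_failures(elements, table):
    """Triples (a, b, c) with c in ab for which no choice of identity and
    inverses gives c^-1 in b^-1 a^-1."""
    failures = []
    for a, b in product(elements, repeat=2):
        for c in sorted(table[(a, b)]):
            ok = any(ci in table[(bi, ai)]
                     for e in identities(elements, table)
                     for ai in inverses(a, e, elements, table)
                     for bi in inverses(b, e, elements, table)
                     for ci in inverses(c, e, elements, table))
            if not ok:
                failures.append((a, b, c))
    return failures

print(inversive_failures(elements, table))

# For comparison: a group, written with single element products, is inversive.
z3_elements = [0, 1, 2]
z3 = {(x, y): {(x + y) % 3} for x in z3_elements for y in z3_elements}
print(inversive_failures(z3_elements, z3))
```

A non-empty first list exhibits the obstruction used in Theorem 6.2; the triple (a, a, a) corresponds to the remark that a lies in a² while its inverse does not lie in the square of that inverse.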


7. Examples of conjugations. Other writers$^{9}$ have discussed a
conjugation of which the following is an immediate generalization.


7 Loc. cit., p. 96. 

8 This is less restrictive than Dresher and Ore's reversible in itself. See the
reference cited in footnote 4, p. 717.

9 Wall, loc. cit., pp. 92-93. Marty, "Sur les groupes et hypergroupes attachés
à une fraction rationnelle," Annales de l'École Normale Supérieure (3), vol. 53 (1936),
pp. 83-123. A generalization is mentioned by Dresher and Ore, p. 720.




Example 1. Let H be semi-regular, and S, T subgroups of its nucleus.
Let $a \sim b$ if a = sbt where $s \in S$, $t \in T$. This defines a conjugation in H. Denote
the partition hypergroup by {H; S, T}. In particular, if H is a group, S an
invariant subgroup, and T the identity group, then {H; S, T} is the quotient
group H/S. The partition hypergroup {H; S, T} is semi-regular if S = T.


Example 2. Let H be commutative and inversive and contain an identity
e such that each element has exactly one inverse with respect to e. Let $b \sim a$
if b = a or $b = a^{-1}$. This relation is a conjugation in H, and defines a
partition hypergroup {H}. In this case $H \cong \{H\}$ if and only if every
element of H is self-inverse with respect to e. If H is an Abelian group then
{H} is a regular commutative hypergroup in which the product of any two
elements is a set of at most two elements.
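
The Abelian case of Example 2 is easy to tabulate for a cyclic group. The sketch below assumes H is the additive group of integers modulo n, written additively so that the inverse of a is -a; the names `residue_class` and `partition_hypergroup_mod` are illustrative and not from the paper.

```python
def residue_class(a, n):
    """The class {a, -a} of a under the conjugation b ~ a iff b = a or b = -a."""
    return frozenset({a % n, (-a) % n})

def partition_hypergroup_mod(n):
    """Classes and multiplication table of {H} for H the integers modulo n."""
    classes = sorted({residue_class(a, n) for a in range(n)}, key=min)
    table = {}
    for A in classes:
        for B in classes:
            # {c} lies in {a}{b} when c = a' + b' for some a' ~ a, b' ~ b.
            table[(A, B)] = frozenset(residue_class(a + b, n) for a in A for b in B)
    return classes, table

classes, table = partition_hypergroup_mod(7)
for A in classes:
    for B in classes:
        print(sorted(A), sorted(B), [sorted(C) for C in table[(A, B)]])
```

For n = 7 there are four classes, and each product consists of the classes of a + b and a - b, hence of at most two elements, as stated in the example.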


Example 3. We may define a conjugation in an arbitrary hypergroup H
in terms of any subgroup P of the group of automorphisms of H. Inasmuch
as the partition hypergroups obtained in this way play an important role in
a subsequent result, we shall develop here some of their properties.

The conjugation is defined as follows. Let $a \sim b$ if a is mapped on b by
some automorphism of P. This relation is clearly a conjugation, and so
defines a partition hypergroup of H which we shall denote by $\{H\}_P$. By
Theorem 4.3 we have at once:


THEOREM 7.1. If Q is a subgroup of a group P of automorphisms of H,
then there exists a conjugation $\gamma$ in $\{H\}_Q$ such that $\{\{H\}_Q\}_\gamma \cong \{H\}_P$.


THEOREM 7.2. If H is semi-regular (regular) then $\{H\}_P$ is
semi-regular (regular).


Proof. The identity e of a semi-regular hypergroup is mapped on itself
by every automorphism, and therefore the class $\{e\}_P$ contains only e.
Evidently $\{e\}_P$ is a scalar of $\{H\}_P$, so that $\{H\}_P$ is semi-regular.

If H is regular then we must show in addition that every element
of $\{H\}_P$ has exactly one inverse, and that $\{e\}_P \in \{a\}_P\{b\}_P$ implies that
$\{e\}_P \in \{b\}_P\{a\}_P$. If $\{e\}_P \in \{a\}_P\{b\}_P$ then there exist elements a', b' conjugate to
a, b such that $e \in a'b'$, and hence the class $\{b\}_P$ contains the inverse of an element
of $\{a\}_P$. But if a is mapped on a' by an automorphism $p_1$, then $a^{-1}$ is mapped
on $a'^{-1}$ by $p_1$. Since $b' = a'^{-1}$ it then follows that $b' \sim a^{-1}$, that is,
$\{b\}_P = \{a^{-1}\}_P$. The regularity of $\{H\}_P$ follows.


THEOREM 7.3. If G is a group and P a group of its automorphisms,
then $\{G\}_P$ is a regular hypergroup, and is a group if and only if P is the
identity group.




Proof. The first part is a corollary to Theorem 7.2. To prove the last
part, let us suppose $\{G\}_P$ is a group. If $\{a\}_P \in \{G\}_P$ then $\{a\}_P\{a^{-1}\}_P = \{e\}_P$.
Since $\{e\}_P$ contains but one element, we must have $a'a^{-1} = e$ if $a' \sim a$, so that
a' = a, and hence P contains only the identity automorphism. The converse
is obviously true.


8. Product hypergroups.$^{10}$ The product $A \times B$ of two hypergroups is
the set of all ordered couples $a \times b$ where $a \in A$, $b \in B$, and where multiplication
between couples is defined by agreeing that $a \times b \in (a_1 \times b_1)(a_2 \times b_2)$ if
$a \in a_1a_2$, $b \in b_1b_2$. It is easily seen that $A \times B$ is a hypergroup. A subhypergroup
H of $A \times B$ is called a sub-product of A and B if each element of A (and
likewise each element of B) is represented in at least one couple $a \times b$ in H.
A sub-product of A and B will be denoted by $A \otimes B$.

We shall begin by listing, without proof, some of the more obvious
properties of products and sub-products.


(1) $A \times B$ is a group if and only if A and B are groups.

(2) $A \times B \cong B \times A$.

(3) If $A_1$ is a subhypergroup of A, then $A_1 \times B$ is a subhypergroup
of $A \times B$.

(4) If K is a subhypergroup of $A \times B$, then there exist subhypergroups
$A_1$ and $B_1$ of A and B such that K is a sub-product of $A_1$ and $B_1$.

(5) A sub-product $A \otimes B$ is semi-regular only if A and B are
semi-regular. The nucleus of $A \otimes B$ is a sub-product of subgroups of the nuclei
of A and B. $A \times B$ is semi-regular (regular) if and only if A and B are
semi-regular (regular), and its nucleus is $M \times N$, where M and N are the
nuclei of A and B.


If B contains an "idempotent" element b such that $b^2 = b$, then it is
evident that $A \times B$ contains a subhypergroup isomorphic with A, namely
$A \times b$. The converse is not necessarily so, as shown by the following example.
Let $T_\mu$ be the hypergroup of $\mu$ elements $t_0, t_1, \ldots, t_{\mu-1}$, where for every
$\lambda$, $\mu$, $\nu$ we have $t_\lambda \in t_\mu t_\nu$. Let $T_\omega$ be a set of a countable number of elements
$t_\lambda'$ with multiplication similarly defined. $T_\mu$ has no subhypergroups, and
no idempotent elements. Yet $T_\omega \cong T_\omega \times T_\mu$ under the correspondence

10 See Robert Remak's papers, "Über minimale invariante Untergruppen in der
Theorie der endlichen Gruppen," vol. 162 (1930), pp. 1-16, and "Über die Darstellung
der endlichen Gruppen als Untergruppen direkter Produkte," vol. 163 (1930), pp. 1-44,
of the Journal für Mathematik.




th’ X ty tyes’. Note that the product Pu Ty = Tp has no subhyper- 
groups. The following theorem gives conditions under which the converse 


is true. 


THEOREM 8.1. Let every descending chain of subhypergroups $A \supset A'
\supset A'' \supset \cdots$ of A be finite, and let A contain an identity e such that $e^2$ is a
finite set. If $A \times B$ contains a subhypergroup isomorphic with A, then B
must contain an idempotent element.


Proof. Let $K_0$ be a subhypergroup of $A \times B$ such that $K_0 \cong A$. Then
by (4) A, B contain subhypergroups A', B' such that $K_0 = A' \otimes B'$, a
sub-product of A' and B'. If A' = A the argument proceeds as in the next
paragraph. If $A' \neq A$, then $K_0 \supset K_1 \cong A'$, since $K_0 \cong A$. Therefore by (4),
A', B' contain subhypergroups A'', B'' such that $K_1 = A'' \otimes B''$. If $A'' \neq A'$,
then $K_1 \supset K_2 \cong A''$. Therefore by (4) A'', B'' contain subhypergroups A''',
B''' such that $K_2 = A''' \otimes B'''$. Continuing in this way we get a descending
chain of subhypergroups $A \supset A' \supset A'' \supset \cdots$ which must terminate.
Therefore there is a $\nu$ such that $A^{(\nu)} = A^{(\nu+1)}$. Without loss of generality we can
assume that A' = A.

Let $a \leftrightarrow a' \times b'$ be corresponding elements under the isomorphism
$A \cong A \otimes B' = K_0$. If $a_0$ is an identity in A then $a_0' \times b_0'$ is an identity
in $K_0$, and $a_0'$ and $b_0'$ are identities in A and B' respectively. Let $a_1, a_2, \ldots,
a_\eta, \ldots$ be the identities of A. Let $\alpha_\eta$, $\beta_\eta$, $\gamma_\eta$ be the numbers of elements
in the sets $a_\eta'^2$, $b_\eta'^2$, and $a_\eta^2$. Thus $\alpha_\eta$, $\beta_\eta$, $\gamma_\eta$ are positive integers (or
infinite) and $\alpha_\eta\beta_\eta = \gamma_\eta$. There is a smallest $\gamma_\eta$, let it be $\gamma_r$. Since $\alpha_r\beta_r = \gamma_r$
we have $\alpha_r \leq \gamma_r$. But $\alpha_r$ is also among the integers $\gamma_\eta$, since $a_r'$ is an
identity in A. Thus $\alpha_r \geq \gamma_r$ and $\alpha_r = \gamma_r$, so that $\beta_r = 1$. Since $b_r'$ is an
identity, $b_r'^2 = b_r'$, and B has an idempotent element.


COROLLARY 8.1. Let $A \times B$ satisfy the descending chain condition, and
have an identity e such that $e^2$ is a finite set. Then $A \times B$ contains
subhypergroups $A_0$ and $B_0$, isomorphic with A and B respectively, if and only if
$A \times B$ contains an idempotent element, the intersection of $A_0$ and $B_0$.


We next consider the question: when can a hypergroup be expressed as 
a product? In order to get an answer to our question we must first consider 
conjugations in product hypergroups, which can always be expressed in terms 
of conjugations in the factor hypergroups, according to the following theorem. 
THEOREM 8.2. If A and B are hypergroups with conjugations among
their elements, then there is a conjugation among the elements of $A \times B$ such
that $\{A\} \times \{B\} \cong \{A \times B\}$. Conversely, if there is a conjugation among
the elements of $A \times B$ then there exist conjugations among the elements of A
and of B such that $\{A \times B\} \cong \{A\} \times \{B\}$.


Proof. Let $a \times b \sim a' \times b'$ if, and only if, $a \sim a'$ and $b \sim b'$. The
mapping $\{a \times b\} \to \{a\} \times \{b\}$ is seen to be an isomorphism.


COROLLARY 8.2. There exists a conjugation in $A \times B$ such that
$\{A \times B\} \cong A$. That is, $A \times B \simeq A$.


Proof. In A let $a \sim a'$ if a = a', and in B let $b \sim b'$ for every b and b'.


The following theorem is a direct generalization from standard group 
theory. 


THEOREM 8.3. Necessary and sufficient conditions that a semi-regular
hypergroup H be the product of hypergroups A and B are:


(1) A and B are semi-regular and contained in H;
(2) if $a \in A$ and $b \in B$ then ab = ba is a single element;
(3) AB = H, and $a_1b_1 = a_2b_2$ only if $a_1 = a_2$ and $b_1 = b_2$.


Proof. To establish the sufficiency of these conditions consider $A \times B$.
Each element of H is uniquely representable as a product ab. The mapping
$a \times b \to ab$ is seen to be an isomorphism. The necessity is easily seen.


THEOREM 8.4. A necessary and sufficient condition that an arbitrary
hypergroup H be isomorphic with the product of two hypergroups is that there
be two conjugations $\gamma_1$ and $\gamma_2$ in H with the following properties. For every
pair x and y in H there exists a unique element c such that $c \sim x$ relative
to $\gamma_1$ and $c \sim y$ relative to $\gamma_2$. If and Yoeyiyr, then Then


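Proof of necessity. In $A \times B$ define $\gamma_1$ by $a \times b \sim a' \times b'$ when a = a',
and $\gamma_2$ by $a \times b \sim a' \times b'$ when b = b'. These conjugations satisfy the
conditions above, and $\{A \times B\}_{\gamma_1} \cong A$, $\{A \times B\}_{\gamma_2} \cong B$.


Proof of sufficiency. Consider $\{H\}_{\gamma_1} \times \{H\}_{\gamma_2}$. Since the classes $\{x\}_{\gamma_1}$
and $\{y\}_{\gamma_2}$ have just one element c in common, the pair $\{x\}_{\gamma_1} \times \{y\}_{\gamma_2}$ can be
represented uniquely as $\{c\}_{\gamma_1} \times \{c\}_{\gamma_2}$. The mapping $\{c\}_{\gamma_1} \times \{c\}_{\gamma_2} \to c$ is an
isomorphism of $\{H\}_{\gamma_1} \times \{H\}_{\gamma_2}$ with H.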

We conclude with conditions under which the product is commutative
or inversive.


THEOREM 8.5. A necessary and sufficient condition that $A \times B$ be
commutative is that both A and B be commutative.


THEOREM 8.6. A necessary and sufficient condition that $A \times B$ be
inversive is that both A and B be inversive.


9. The lattices of a hypergroup. The structure of a hypergroup is
clarified by studying its lattice of subsets. A lattice is defined as a set $\mathfrak{L}$ of
elements A, B, C, ... such that the following conditions are satisfied.


(1) For each pair of elements A and B in $\mathfrak{L}$ there exist elements $A \cup B$
and $A \cap B$ in $\mathfrak{L}$, called respectively the union and intersection of A and B.


(2) These combinations are commutative, $A \cup B = B \cup A$ and
$A \cap B = B \cap A$.


(3) They are associative as well, $A \cup (B \cup C) = (A \cup B) \cup C$ and
$A \cap (B \cap C) = (A \cap B) \cap C$.


(4) For each pair A and B we have $A \cup (B \cap A) = A = (A \cup B) \cap A$.


The intersection $C \cap D$ of two subsets C and D of a hypergroup H is the
set of all elements common to the two. The union$^{11}$ $C \cup D$ is the set of all
elements contained in products $x_1x_2 \cdots x_\eta$, $\eta$ any integer, where $x_\eta$ is an
element of either C or D. The closed subsets of a hypergroup A form a
lattice.$^{12}$

If $\mathfrak{A}$ and $\mathfrak{B}$ are lattices the set of all pairs of elements of $\mathfrak{A}$ and $\mathfrak{B}$ form
a lattice $\mathfrak{A} \times \mathfrak{B}$, their direct join.$^{13}$ If $\mathfrak{A}$ is the lattice of the closed subsets
of a hypergroup A, and $\mathfrak{B}$ is that of the hypergroup B, what is the relation
between $A \times B$ and $\mathfrak{A} \times \mathfrak{B}$? To answer this we define a plenary subset of
$A \times B$ as one which is the product $H \times J$ of its component sets. If $H_1 \times J_1$
and $H_2 \times J_2$ are two plenary subsets of $A \times B$ then

$$(H_1 \times J_1) \cup (H_2 \times J_2) = (H_1 \cup H_2) \times (J_1 \cup J_2) \quad \text{and}$$
$$(H_1 \times J_1) \cap (H_2 \times J_2) = (H_1 \cap H_2) \times (J_1 \cap J_2).$$

The plenary subset $H \times J$ is closed under multiplication if and only if both
H and J are closed. The closed plenary subsets of $A \times B$ form a lattice
isomorphic with $\mathfrak{A} \times \mathfrak{B}$.

We next consider the questions, what sublattices does the lattice of the
hypergroup have, and when is the lattice of {H} among them? Theorem 9.1
contributes to the answer of the first part of the question, and the next two
theorems to the second part.


11 Dresher and Ore, pp. 714, 715.

12 Dresher and Ore, p. 715, Theorem 2.

13 Garrett Birkhoff, "On the combinations of subalgebras," Proceedings of the
Cambridge Philosophical Society, vol. 29 (1933), pp. 441-464, Theorem 18.1.




THEOREM 9.1. Let H be a regular inversive hypergroup. The set of
proper subhypergroups of H forms a lattice.


Proof. If J and K are proper subhypergroups of H then $J \cup K$ is closed
under multiplication, and contains the identity. If b is in $J \cup K$ then $b^{-1}$ is
in $J \cup K$, by the Lemma 1 following. Therefore $J \cup K$ is a proper
subhypergroup.

The common part of J and K, $J \cap K$, is closed under multiplication, and
contains the identity and the inverse of each of its elements. Therefore $J \cap K$
is a proper subhypergroup.

The operations of union and intersection are commutative and associative.
Since the intersection of two proper subhypergroups is the largest contained
in both, and the union the smallest containing both, $J \cup (K \cap J) = J$ and
$(J \cup K) \cap J = J$. The following lemma then completes the proof.


Lemma 1. Jf H 1s an inversive regular hypergroup, then dy 


Proof by induction. Assume the conclusion valid for 7 = 2 and »=p—1. 
It then follows for If then there is an element in 
we have a -+a,;7.. Thus Theorem 9.1 is proved. 


THEOREM 9.2. Let H be a hypergroup with a conjugation among its
elements such that {H} is regular and inversive. The proper appropriate
subhypergroups of H form a lattice isomorphic with that of the proper
subhypergroups of {H}.


Proof. If J and K are proper and appropriate in H then $J \cap K$ is a
proper appropriate subhypergroup, since it is closed under multiplication and
the conjugation and contains all the identities and all the inverses of all its
elements, as seen in Lemma 2. All the identities are in $J \cup K$. By Theorem
9.1 $\{J\} \cup \{K\}$ is a proper subhypergroup of {H}, whence by Lemma 2 there
is a proper appropriate subhypergroup I of H such that $\{I\} = \{J\} \cup \{K\}$.
By Lemma 3, since $\{J\} \cup \{K\}$ contains {J} and {K}, I contains $J \cup K$. If
i is an element in I then {i} is in a product of the type $\{x_1\}\{x_2\} \cdots \{x_\eta\}$,
where $x_\eta$ is in J or K. This is only possible if there are elements $x_\eta'$, in J
if $x_\eta \in J$, in K if $x_\eta \in K$, such that $i \in x_1'x_2' \cdots x_\eta'$. Thus every element i is in
$J \cup K$, that is, $J \cup K = I$, a proper appropriate subhypergroup. As before,
$J \cup (K \cap J) = J$ and $(J \cup K) \cap J = J$, and the lattice postulates are satisfied.
The isomorphism follows from the formulas $\{J \cup K\} = \{J\} \cup \{K\}$, $\{J \cap K\}
= \{J\} \cap \{K\}$, which in turn follow from Lemmas 3 and 4.




PARTITION HYPERGROUPS. 611 


LEMMA 2. Let H be a hypergroup with a conjugation among its elements
such that {H} is regular. For every proper subhypergroup K of {H} there
exists a proper appropriate subhypergroup J of H such that {J} = K. J contains
all identities of H, and all the inverses of all its elements.


Proof. Let J be the set of all elements j which map upon the elements
{j} of K. J is appropriate and closed under multiplication. If e is an
identity in H then {e} is the identity of {H}, and therefore e is in J. If
$h^{-1}$ is an inverse of h with respect to e then $\{h^{-1}\}$ is the inverse of {h} with
respect to {e}. Therefore J contains with h all of its inverses. Thus J is
proper and appropriate, and {J} = K.


LEMMA 3. If H is a hypergroup with a conjugation among its elements,
and if J and K are appropriate subsets of H, then $J \supset K$ if and only if
$\{J\} \supset \{K\}$.


Proof. If $J \supset K$ and $\{k\} \in \{K\}$, then $k \in K$, and so $k \in J$, and therefore
$\{k\} \in \{J\}$. Therefore $\{J\} \supset \{K\}$. If $\{J\} \supset \{K\}$ and $k \in K$, then $\{k\} \in \{K\}$,
whence $\{k\} \in \{J\}$, and so $k \in J$. Therefore $J \supset K$.


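LEMMA 4. The union of two appropriate subsets is appropriate.


Proof by induction. Let h be an element of the union $J \cup K$ of two
appropriate subsets. Then h is contained in a product $x_1 x_2 \cdots x_n$, where $x_\eta$
is in either J or K. Let $h' \sim h$. If $n = 2$ then there exist $x_1' \sim x_1$, $x_2' \sim x_2$
such that $h' \in x_1' x_2'$. Suppose that for $n = \nu - 1$ there exist $x_\eta' \sim x_\eta$,
$\eta = 1, 2, \cdots, \nu - 1$, such that $h' \in x_1' x_2' \cdots x_{\nu-1}'$. Then a similar statement
holds for $n = \nu$. For $h \in x_1 x_2 \cdots x_\nu$ implies that there is an element
$b \in x_1 x_2 \cdots x_{\nu-1}$ such that $h \in b x_\nu$. If $h' \sim h$ then there exist $b' \sim b$, $x_\nu' \sim x_\nu$ such
that $h' \in b' x_\nu'$. By hypothesis there exist $x_\eta' \sim x_\eta$, $\eta = 1, 2, \cdots, \nu - 1$, such
that $b' \in x_1' x_2' \cdots x_{\nu-1}'$. Therefore $h' \in b' x_\nu' \subset x_1' x_2' \cdots x_\nu'$. Since J and K
are appropriate we have $x_\eta \in J$ implies $x_\eta' \in J$ and $x_\eta \in K$ implies $x_\eta' \in K$. Therefore
$J \cup K$ is appropriate. The proof of Theorem 9.2 is now complete.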

Let G be a group with a conjugation among its elements. If H is a
subset of {G} which is closed under multiplication, and if the set D in G
which maps upon H is finite, then H is a subhypergroup of {G}. For D is
closed under multiplication, and being finite, is an appropriate subgroup of G.
Therefore H = {D} is a subhypergroup of {G}.

If G is finite then the subhypergroups of {G} form a lattice $\mathfrak{L}$. For the
closed subsets of {G} form a lattice, and by the paragraph above each closed
subset is a subhypergroup. Since, if D and F are appropriate subgroups of G,
$\{D \cup F\} = \{D\} \cup \{F\}$ and $\{D \cap F\} = \{D\} \cap \{F\}$, the appropriate subgroups
of G form a lattice isomorphic with $\mathfrak{L}$.

Let H be a regular hypergroup with a conjugation among its elements
such that for every element b we have $b^{-1} \sim b$. Let J be a subset of {H}
which is closed under multiplication, and K be the set of elements of H which
map upon the elements of J. Therefore K is closed under multiplication and
the conjugation. With b it contains $b^{-1}$, and so it contains e. Therefore K
is an appropriate proper subhypergroup of H.

If J and L are subhypergroups of H then $\{J \cup L\} = \{J\} \cup \{L\}$ and
$\{J \cap L\} = \{J\} \cap \{L\}$. Thus we have


THEOREM 9.3. Let H be a regular hypergroup with a conjugation among
its elements such that for every b, $b \sim b^{-1}$. Then the subhypergroups of {H}
form a lattice isomorphic with that of the appropriate subhypergroups of H.


We conclude with a condition that a group G be simple. Let S be a
subgroup of the automorphisms of G. Let $b \sim c$ in G when $b \to c$ under an
automorphism of S. Now $\{G\}_S$ is a finite regular inversive hypergroup, and
the set of appropriate subgroups of G is a lattice isomorphic with that of the
proper subhypergroups of $\{G\}_S$. A subgroup F is appropriate if and only
if it is characteristic under S. If S is the group of inner automorphisms of G
then the appropriate subgroups of G are the normal subgroups, and the lattice
of the proper subhypergroups of $\{G\}_S$ is a B-lattice.¹⁴ We thus have the
following theorem:


THEOREM 9.4. If S is the group of inner automorphisms of the group G,
then G is simple if and only if the partition hypergroup {G} has no proper
subhypergroups except E, the identity group.
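
Theorem 9.4 can be tried out on small permutation groups. The sketch below is only an illustration and not part of the paper: it assumes that, for S the inner automorphisms, the elements of {G} are the conjugacy classes of G, and that a closed subset of {G} corresponds to a union of classes which contains the identity and is closed under the group multiplication. All function names are hypothetical.

```python
from itertools import combinations, permutations


def compose(a, b):
    """Return the permutation 'a after b' for permutations stored as tuples."""
    return tuple(a[i] for i in b)


def inverse(a):
    inv = [0] * len(a)
    for i, image in enumerate(a):
        inv[image] = i
    return tuple(inv)


def is_even(p):
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return inversions % 2 == 0


def symmetric_group(n):
    return list(permutations(range(n)))


def alternating_group(n):
    return [p for p in permutations(range(n)) if is_even(p)]


def conjugacy_classes(group):
    """Classes of the conjugation b ~ c induced by the inner automorphisms."""
    seen, classes = set(), []
    for x in group:
        if x not in seen:
            cls = frozenset(compose(compose(g, x), inverse(g)) for g in group)
            classes.append(cls)
            seen |= cls
    return classes


def closed_unions(group):
    """Unions of classes containing the identity and closed under multiplication."""
    identity = tuple(range(len(group[0])))
    classes = conjugacy_classes(group)
    found = []
    for r in range(1, len(classes) + 1):
        for combo in combinations(classes, r):
            union = frozenset().union(*combo)
            if identity in union and all(compose(a, b) in union for a in union for b in union):
                found.append(union)
    return found


def closed_union_sizes(group):
    return sorted(len(u) for u in closed_unions(group))
```

Under these assumptions, the symmetric group on three letters yields an intermediate closed union (the alternating subgroup), while the alternating group on five letters, which is simple, yields only the identity and the whole group.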


UNIVERSITY OF MINNESOTA. 


14 Birkhoff, loc. cit., Section II. 




ON THE ALMOST PERIODIC BEHAVIOR OF MULTIPLICATIVE 
NUMBER-THEORETICAL FUNCTIONS.* 


By E. R. van KAMPEN and AUREL WINTNER. 


The purpose of the present paper is to develop criteria for the almost
periodic behavior $(B^\lambda)$ of multiplicative number-theoretical functions. In
the particular case of what have been called strongly multiplicative functions,
such criteria were recently¹ found for $\lambda = 1$ and $\lambda = 2$. However, the only
representative of the classical number-theoretical functions in the class of
strongly multiplicative functions is $\varphi(n)/n$, where $\varphi$ is Euler’s function.
Thus, there arises the question as to the possibility of a corresponding theory
in the general case.

It will turn out that such a theory can be developed, although the situation
then is essentially more involved. In fact, already the question of the
almost periodicity $(B^\lambda)$ of the factor functions, which belong to each of the
prime numbers, must be discussed. Correspondingly, the preservation of
almost periodicity $(B^\lambda)$ on multiplication of a finite number of such factor
functions requires especial care. The limit process which leads to the given
multiplicative function is formally more involved than, though in principle
not different from, the corresponding step in the strongly multiplicative case.

The results to be obtained may be illustrated by the sum, $\sigma(n)$, of the
divisors of n. The result in this case will be that $\sigma(n)/n$ is almost periodic
$(B^\lambda)$ for arbitrarily large $\lambda$ and has the Fourier expansion

$$\frac{\sigma(n)}{n} \sim \frac{\pi^2}{6} \sum_{m=1}^{\infty} \frac{c_m(n)}{m^2},$$

where the c’s denote the Ramanujan sums. But Ramanujan² has proved that

$$\sigma(n) = \frac{\pi^2}{6}\, n \sum_{m=1}^{\infty} \frac{c_m(n)}{m^2};$$

so that, if one divides by n, Ramanujan’s trigonometric series turns out to be
the Fourier series of the function to which it converges.
That Ramanujan’s results do not imply any almost periodic behavior may 


* Received November 15, 1939. 

1 M. Kac, E. R. van Kampen and Aurel Wintner, “Ramanujan sums and almost
periodic functions,” American Journal of Mathematics, vol. 62 (1940), pp. 107-114.

2 S. Ramanujan, Collected Papers (Cambridge, 1927), pp. 179-199.




be illustrated by the following example: Ramanujan proves that if d(n)
denotes the number of divisors of n, then

$$d(n) = -\sum_{m=1}^{\infty} \frac{\log m}{m}\, c_m(n).$$

But this convergent trigonometrical series cannot be the Fourier series (B)
of the function, d(n), which it represents. In fact,

$$\frac{1}{n} \sum_{m=1}^{n} d(m) \sim \log n, \qquad n \to \infty, \qquad \text{(Dirichlet)}$$

implies that the mean value of d(n) is $+\infty$; so that d(n) cannot be almost
periodic (B).
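
The growth of the mean of d(n) is easy to observe numerically. The following sketch is an illustration only; it uses the elementary identity that the sum of d(m) over m up to n equals the sum of the integer parts of n/k over k up to n, and it compares the mean with the classical refinement log n + 2γ − 1 of Dirichlet's relation (the constant 2γ − 1 is not used in the paper and is quoted here as a known fact).

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def divisor_mean(n):
    """Arithmetical mean of d(1), ..., d(n), via sum_{m<=n} d(m) = sum_{k<=n} floor(n/k)."""
    return sum(n // k for k in range(1, n + 1)) / n


def dirichlet_approximation(n):
    """Dirichlet's asymptotic value log n + 2*gamma - 1 for the mean above."""
    return math.log(n) + 2 * EULER_GAMMA - 1
```

Since the mean keeps growing like log n, no finite mean value M{d} can exist, which is the obstruction mentioned in the text.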

Incidentally, the results to be obtained are, in contrast to the results of 
Ramanujan, independent of the prime number theorem. 

It should be mentioned that, while Theorems IV and VI below may,
with straightforward modification of proof and wording, be transferred to the
case where multiplicative functions are replaced by additive functions, essential
complications seem to arise in connection with the corresponding analogue
to Theorem V below (if $\lambda > 1$).

By a function f(n) will be meant a sequence in which n runs through
all positive integers. The average $M\{f\} = M\{f(n)\}$ of an f is defined as the
limit ($n \to \infty$) of the arithmetical mean of the n numbers $f(1), \cdots, f(n)$,
if this limit exists. And $\overline{M}\{f\} = \overline{M}\{f(n)\}$ will denote the upper limit
($\leq +\infty$) of this arithmetical mean, if $f \geq 0$.

By a multiplicative function f(n) is meant a sequence $f(1), f(2), \cdots$
of numbers for which

$f(n_1 n_2) = f(n_1) f(n_2)$ whenever $(n_1, n_2) = 1$; hence, $f(1) = 1$

unless f(n) = 0 for every n (this trivial case will be excluded).

If there exists a fixed prime number p* such that $f(p^i) = 1$ for every i
and for every prime number p distinct from p*, the function f(n) will be
called a prime multiplicative function (belonging to the prime number p*).
It is clear that if $p_m$ denotes the m-th prime number and $f_m(n)$ an arbitrary
prime multiplicative function belonging to $p_m$, then


(1)  $f(n) = \prod_{m=1}^{\infty} f_m(n)$


defines a multiplicative function f(n), all but a finite number of the factors
of the infinite product being 1 for a fixed n. Conversely, every given multiplicative
function f determines a unique sequence $\{f_m\}$ of prime multiplicative
functions $f_m$ by means of which f is representable in the form (1). In fact,


(2)  $f_m(n) = f(p_m^k)$ if $p_m^k \mid n$ but $p_m^{k+1} \nmid n$  ($f(1) = 1$).
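
The factorization (1), (2) can be checked mechanically. The sketch below is an illustration only, with hypothetical helper names; it takes the multiplicative function σ(n)/n as a test case, builds the prime multiplicative factor belonging to a prime p by reading off the exact power of p in n, and multiplies the factors over the primes not exceeding n (larger primes contribute the factor f(1) = 1).

```python
from fractions import Fraction


def sigma_over_n(n):
    """The multiplicative function sigma(n)/n, as an exact fraction."""
    return Fraction(sum(d for d in range(1, n + 1) if n % d == 0), n)


def exact_prime_power(p, n):
    """Return p**k, where p**k divides n but p**(k+1) does not."""
    q = 1
    while n % (q * p) == 0:
        q *= p
    return q


def primes_up_to(n):
    return [p for p in range(2, n + 1) if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def prime_factor(f, p):
    """The prime multiplicative function of (2) belonging to the prime p."""
    return lambda n: f(exact_prime_power(p, n))


def product_of_factors(f, n):
    """Right-hand side of (1); primes above n contribute the factor f(1) = 1."""
    result = Fraction(1)
    for p in primes_up_to(n):
        result *= prime_factor(f, p)(n)
    return result
```

For instance, the factor of σ(n)/n belonging to the prime 2 takes the value σ(4)/4 = 7/4 at n = 12.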


In what follows, g(n) will denote an arbitrary function which satisfies,
for a fixed prime p, the requirement that


(3)  $g(n) = g(p^k)$ if $p^k \mid n$ but $p^{k+1} \nmid n$.

Condition (3) is satisfied by every prime multiplicative function belonging
to p, but not only by these functions; in fact, (3) is possible also when g(n)
vanishes for n = 1, without vanishing for every n.

THEOREM I. The average M{g} of a function (3) exists if and only if
the series

(4)  $\sum_{k=0}^{\infty} p^{-k} g(p^k)$ is convergent,

in which case

(5)  $M\{g\} = (1 - p^{-1}) \sum_{k=0}^{\infty} p^{-k} g(p^k)$.

Remark. It will be clear from the proof that (5) holds, if $g \geq 0$, also
when the series (4) is divergent (in which case $M\{g\} = +\infty$).

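Proof. Let $a_i$ denote the arithmetical mean of the $p^i$ numbers
$g(1), \cdots, g(p^i)$, where i is an arbitrary non-negative integer and p the
prime number belonging to g. Then, by (3),

(6)  $a_i = (1 - p^{-1}) \sum_{k=0}^{i-1} p^{-k} g(p^k) + p^{-i} g(p^i)$.

Hence, for every i > 0,

$a_i - a_{i-1} = p^{-i} (g(p^i) - g(p^{i-1}))$;  $a_0 = g(1)$,

and so

(7)  $p^{-i} g(p^i) = p^{-i} g(1) + \sum_{k=1}^{i} p^{k-i} (a_k - a_{k-1})$.

Suppose first that M{g} exists. Then, in particular,

$a_i \to M\{g\}$, and so $a_i - a_{i-1} \to 0$ as $i \to \infty$.

Consequently, application to (7) of a standard lemma concerning linear
summation methods shows that $p^{-i} g(p^i) \to 0$ as $i \to \infty$. Hence, (4) and
(5) follow from (6).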

In order to prove the sufficiency of (4) for the existence of M{g}, let

$n = \sum_{i=0}^{m} q_i p^i$, where $0 \leq q_i < p$;

be the p-adic representation of an arbitrary n > 0. Then, by (3) and (6),

$\frac{1}{n} \sum_{k=1}^{n} g(k) = \frac{\sum_{i=0}^{m} q_i p^i a_i}{\sum_{i=0}^{m} q_i p^i}$.

It follows, therefore, from the standard lemma on linear summation methods,
used before, that in order to prove the existence of M{g}, it is sufficient to
assure that the sequence $a_0, a_1, \cdots$ (which merely is a subsequence of the
sequence defining M{g}) has a limit $\neq \infty$. But (6) and the assumption (4)
clearly imply the existence of $\lim a_i$ ($\neq \infty$); so that the proof is complete.
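
The two ingredients of this proof can be tested on a concrete function. The sketch below is an illustration only and rests on two readings of the text that should be treated as assumptions: that the displayed relation following the p-adic representation states that the sum of g(1), ..., g(n) equals the sum of $q_i p^i a_i$, and that the limit in Theorem I is $(1 - p^{-1}) \sum_k p^{-k} g(p^k)$. The test function $g(p^k) = k + 1$ with p = 3 is a hypothetical choice, for which the series gives 3/2.

```python
from fractions import Fraction


def valuation(p, n):
    """Exponent k such that p**k divides n but p**(k+1) does not."""
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def make_g(p, values):
    """Build g(n) = values(k), where p**k exactly divides n, so that (3) holds."""
    return lambda n: Fraction(values(valuation(p, n)))


def block_mean(g, p, i):
    """a_i: arithmetical mean of g(1), ..., g(p**i)."""
    return sum(g(n) for n in range(1, p ** i + 1)) / p ** i


def p_adic_digits(p, n):
    digits = []
    while n:
        digits.append(n % p)
        n //= p
    return digits


def weighted_block_sum(g, p, n):
    """sum_i q_i * p**i * a_i for the p-adic digits q_i of n."""
    return sum(q * p ** i * block_mean(g, p, i) for i, q in enumerate(p_adic_digits(p, n)))


def series_value(p, values, terms=60):
    """Partial sum of (1 - 1/p) * sum_k p**(-k) * g(p**k)."""
    return (1 - Fraction(1, p)) * sum(Fraction(values(k), p ** k) for k in range(terms))
```

The first check confirms that every partial sum of g is a weighted combination of the block means $a_i$, which is what makes the summation lemma applicable; the second shows the block means approaching the value of the series.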


THEOREM II. A function g which satisfies (3) (and so, in particular,
a prime multiplicative function belonging to a prime p) is almost periodic
$(B^\lambda)$ for a given positive $\lambda$ ($\gtreqless 1$) if and only if *


(8)  $\sum_{k=0}^{\infty} p^{-k} |g(p^k)|^\lambda < \infty$.


(This implies, for $\lambda = 1$, the curious fact that g(n) is almost periodic (B)
whenever so is |g(n)|).

It is understood that if $\lambda < 1$ in Theorem II (so that there is no Hölder-Minkowski
inequality and, correspondingly, no natural metric in the $B^\lambda$-space),
then M{g} need not exist, and so, in particular, g(n) need not have a Fourier
expansion.

Proof. If g(n) is almost periodic $(B^\lambda)$, so is |g(n)|; so that $M\{|g|^\lambda\}$
exists. Consequently, application of Theorem I to the function $|g(n)|^\lambda$ shows
that (8) is a necessary condition for the almost periodicity $(B^\lambda)$ of g.

In order to prove the converse, define, in terms of any given function g,
for every positive integer j a function $g^j$, by placing


(9)  $g^j(n) = g(n)$ if $1 \leq n \leq p^j$,  $g^j(n + p^j) = g^j(n)$ for every n.


Then it is clear that (3) remains valid if one replaces g(n) by the non-negative
function $|g(n) - g^j(n)|^\lambda$ of n for a fixed j. Hence, the Remark
which follows Theorem I implies that

$M\{|g - g^j|^\lambda\} = (1 - p^{-1}) \sum_{k=0}^{\infty} p^{-k} |g(p^k) - g^j(p^k)|^\lambda$.


* This result is closely related to a construction due to O. Toeplitz, “ Ein Beispiel 
zur Theorie der fastperiodischen Funktionen,” Mathematische Annalen, vol. 98 (1927), 
pp. 281-295. 




On letting here $j \to \infty$, one sees from (9) that, if (8) is satisfied,

$M\{|g - g^j|^\lambda\} \to 0$ as $j \to \infty$.


Since every $g^j$ is, by (9), a periodic function of n, it follows that g is almost
periodic $(B^\lambda)$. This completes the proof of Theorem II.
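
The periodic approximations of (9) can be watched converging in the mean. The sketch below is an illustration only: it takes p = 2, the hypothetical test function $g(p^k) = k + 1$, and the exponent 1, and it estimates the mean of $|g - g^j|$ over the first $2^{16}$ integers. A hand computation with the formula of Theorem I, applied to $|g - g^j|$, suggests the value $2^{-j}$ for this particular g; that value is an assumption of the sketch, not a statement of the paper.

```python
def valuation(p, n):
    """Exponent k such that p**k divides n but p**(k+1) does not."""
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def g(n, p=2):
    """Test function satisfying (3): g(p**k) = k + 1."""
    return valuation(p, n) + 1


def g_periodic(n, j, p=2):
    """g^j of (9): equal to g on 1..p**j and continued with period p**j."""
    return g((n - 1) % p ** j + 1, p)


def mean_distance(j, p=2, upto=2 ** 16):
    """Empirical mean of |g - g^j| over 1..upto."""
    return sum(abs(g(n, p) - g_periodic(n, j, p)) for n in range(1, upto + 1)) / upto
```

The distances halve with each step in j, in line with the limit relation above.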


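THEOREM III. A function g(n) which satisfies (3) is almost periodic
(B) if and only if


(10)  $\sum_{k=0}^{\infty} p^{-k} |g(p^k)| < \infty$,


in which case the Fourier expansion of g(n) is


(11)  $g(n) \sim M\{g\} + \sum_{j=1}^{\infty} c_{p^j}(n) \sum_{k=j}^{\infty} p^{-k} (g(p^k) - g(p^{k-1}))$,

where the constant term is

(12)  $M\{g\} = (1 - p^{-1}) \sum_{k=0}^{\infty} p^{-k} g(p^k)$, by (5),


and the $c_{p^j}(n)$ denote the Ramanujan sums belonging to those indices m which
are powers of p:


(13)  $c_m(n) = \sum_k \cos (2\pi k n / m)$, where $1 \leq k \leq m$ and $(k, m) = 1$.

In particular, g(n) is limit-periodic (grenzperiodisch), since the Fourier
constants belonging to (circular) irrational frequencies all vanish.
Remark. Assuming that (8) is satisfied for $\lambda = 2$, one sees from (12)
and (13) that the Parseval relation belonging to (11) is


(14)  $(1 - p^{-1}) \sum_{k=0}^{\infty} p^{-k} |g(p^k)|^2 = (1 - p^{-1})^2 \left| \sum_{k=0}^{\infty} p^{-k} g(p^k) \right|^2 + \sum_{j=1}^{\infty} (p^j - p^{j-1}) \left| \sum_{k=j}^{\infty} p^{-k} (g(p^k) - g(p^{k-1})) \right|^2$,


an identity which can, of course, be verified directly.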


Proof. Since (10) is the particular case $\lambda = 1$ of the criterion (8) of
Theorem II, only the explicit form (11) of the Fourier expansion needs a
proof. To this end, one can readily verify from the definitions (12), (13)
and (9), that (11) is certainly true if g is replaced by the periodic function
$g^j$ (where j is arbitrarily fixed); in fact, (11) follows for $g = g^j$ by straightforward
trigonometrical interpolation. Since, by the proof of Theorem II,
one has $M\{|g - g^j|\} \to 0$ as $j \to \infty$, it follows that (11) holds for any g.
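
The interpolation step for the periodic functions can be carried out exactly. The sketch below is an illustration only. It assumes that the coefficient of the Ramanujan sum $c_{p^j}(n)$ in (11) is $\sum_{k \geq j} p^{-k} (g(p^k) - g(p^{k-1}))$ and that the constant term is the one given in (12); for the periodic function $g^J$ both sums are finite, because $g^J(p^k) = g(p^J)$ for $k \geq J$. The test values $g(p^k) = k^2 + 1$ and the names are hypothetical.

```python
import math
from fractions import Fraction


def ramanujan_sum(m, n):
    """c_m(n) as in (13): sum of cos(2*pi*k*n/m) over 1 <= k <= m with (k, m) = 1."""
    total = sum(math.cos(2 * math.pi * k * n / m) for k in range(1, m + 1) if math.gcd(k, m) == 1)
    return round(total)  # the sum is an integer up to rounding error


def valuation(p, n):
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def fourier_side(values, p, J, n):
    """Right-hand side of (11) for the periodic function g^J (finitely many terms)."""
    def g_J(k):  # g^J(p**k) = g(p**min(k, J))
        return Fraction(values(min(k, J)))

    # Constant term (12) for g^J: the tail k >= J sums to g(p**J) / p**J.
    mean = (1 - Fraction(1, p)) * sum(g_J(k) / p ** k for k in range(J)) + g_J(J) / p ** J
    total = mean
    for j in range(1, J + 1):
        coefficient = sum((g_J(k) - g_J(k - 1)) / p ** k for k in range(j, J + 1))
        total += ramanujan_sum(p ** j, n) * coefficient
    return total
```

With these assumptions the finite trigonometric sum reproduces $g^J(n)$ exactly on a full period.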




Since the preceding results concern an arbitrary function which satisfies
(3), they are applicable to every prime multiplicative function, and so to any
of the factors (2) of an arbitrary multiplicative function (1). In what follows,
there will be established natural analogues to Theorems I-III for the
case where prime multiplicative functions $g(n) = f_m(n)$ are replaced by
arbitrary multiplicative functions f(n). It seems to be hard to replace
Theorems IV, V, VI below by theorems which are of the same sharpness as
the corresponding Theorems I, II, III above.

It will be convenient to associate with every multiplicative function f(n)
another multiplicative function $f_*(n)$, which is defined by


(15)  $f_*(p^k) = \frac{f(p^k) - f(p^{k-1})}{p^k}$.

Then, since also $f_*(n)$ is multiplicative,


(16)  $f(n) = \sum_{d \mid n} d\, f_*(d)$.


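THEOREM IV. The average M{f} of a multiplicative function f(n)
exists whenever


(17)  $\sum_{n=1}^{\infty} |f_*(n)| < \infty$,

in which case

(18)  $M\{f\} = \sum_{n=1}^{\infty} f_*(n)$.


Remark. It will be clear from the proof that (18) holds, if $f_*(n) \geq 0$,
also when the series (17) is divergent (in which case $M\{f\} = +\infty$).
Proof. It is seen from (16) that, for every $n \geq 1$,

$\sum_{k=1}^{n} f(k) = \sum_{k=1}^{n} k f_*(k) \left[ \frac{n}{k} \right]$.

Hence,

$\sum_{k=1}^{n} f(k) = n \sum_{k=1}^{n} f_*(k) + O\left( \sum_{k=1}^{n} k |f_*(k)| \right)$

as $n \to \infty$. Since (17) implies that $\frac{1}{n} \sum_{k=1}^{n} k |f_*(k)| \to 0$, it follows that

$\sum_{k=1}^{n} f(k) = n \sum_{k=1}^{n} f_*(k) + o(n)$.


This completes the proof of Theorem IV.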
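
Theorem IV can be illustrated on σ(n)/n. The sketch below is an illustration only; it assumes that (15) reads $f_*(p^k) = (f(p^k) - f(p^{k-1}))/p^k$ with $f_*$ multiplicative, and that (16) is the divisor sum of $d\, f_*(d)$. Under these assumptions $f_*(n) = 1/n^2$ for f(n) = σ(n)/n, so that (18) predicts the mean value $\pi^2/6$. Helper names are hypothetical.

```python
import math
from fractions import Fraction


def sigma_over_n(n):
    """The multiplicative function sigma(n)/n, as an exact fraction."""
    return Fraction(sum(d for d in range(1, n + 1) if n % d == 0), n)


def factorize(n):
    """Return {prime: exponent} by trial division."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def star(f, n):
    """f_*(n), built multiplicatively from f_*(p**k) = (f(p**k) - f(p**(k-1))) / p**k."""
    result = Fraction(1)
    for p, k in factorize(n).items():
        result *= (f(p ** k) - f(p ** (k - 1))) / p ** k
    return result


def divisor_sum_side(f, n):
    """Right-hand side of (16): sum of d * f_*(d) over the divisors d of n."""
    return sum(d * star(f, d) for d in range(1, n + 1) if n % d == 0)


def mean(f, n):
    """Arithmetical mean of f(1), ..., f(n) as a float."""
    return float(sum(f(k) for k in range(1, n + 1)) / n)
```

The empirical mean over the first thousand integers already agrees with $\pi^2/6$ to two decimal places.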


It will be convenient to extend the class of an arbitrary multiplicative 
function f in the same way as condition (3) extends the class of prime 




multiplicative functions. The extension in question may be defined by the
requirement that the given f(n) admits of a factorization (1) in which a
factor $f_m$ need not be multiplicative but merely such that condition (3) is
satisfied by $g = f_m$ and $p = p_m$, where $m = 1, 2, \cdots$. Since $f_m(1)$ need
not be 1, it must, of course, be assumed that the product $\prod_{m=1}^{\infty} f_m(1)$ is
convergent and remains convergent if one omits a finite number of its factors.
If these conditions are satisfied, f will be called a generalized multiplicative
function. For the proof of Theorem V below, those and only those
generalized multiplicative functions will be needed for which $f_m(1) = 1$ holds
for every m with the exception of one value, say m = r, for which $f_r(1) = 0$.
For a generalized multiplicative function f, let $f_*$ denote the generalized
multiplicative function


(19)  $f_* = \prod_{m=1}^{\infty} f_{m*}(n)$, where $f_{m*}(1) = f_m(1)$,  $f_{m*}(p_m^k) = p_m^{-k} (f_m(p_m^k) - f_m(p_m^{k-1}))$


and every $f_{m*}$ is prime multiplicative. It is clear that this definition of $f_*$
reduces to the definition (15) if the generalized multiplicative function f is
multiplicative.


THEOREM IV bis. Theorem IV holds for generalized multiplicative 
functions f also. 


This is readily seen from the proof of Theorem IV and from the
definitions of the generalized multiplicative functions f, $f_*$.
The following considerations will be based on an auxiliary lemma.


LEMMA. The product


(20)  $f(n) = \prod_{r=1}^{m} f_r(n)$


of a finite number of prime multiplicative functions $f_1, \cdots, f_m$ which belong
to distinct prime numbers $p_1, \cdots, p_m$ is almost periodic $(B^\lambda)$ for a fixed
$\lambda \geq 1$ whenever each of the functions $f_r$ is almost periodic $(B^\lambda)$ for
this $\lambda$.


Proof. For every $r$ ($= 1, \cdots, m$), define two prime multiplicative
functions $u_r$, $v_r$ of n by placing


$v_r(n) = \mathrm{Max}\,(1, |f_r(n)|)$,  $u_r(n) = f_r(n)/v_r(n)$

so that

$|u_r(n)| \leq 1 \leq v_r(n)$

and

$f_r(n) = u_r(n) v_r(n)$.




Since $u_r$ is a bounded function of n, criterion (8) of Theorem II is satisfied
by $g = u_r$ for every exponent, and so $u_r$ is almost periodic $(B^\lambda)$ for every $\lambda$.
On the other hand, since $f_r$ is supposed to be almost periodic $(B^\lambda)$ for a fixed
$\lambda$, Theorem II assures that (8) is satisfied by $g = f_r$ and so, in view of the
definition of $v_r$, by $g = v_r$ also; so that $v_r$ is almost periodic $(B^\lambda)$ by
Theorem II. Since the product f of the m functions $f_r$ is the product of the
m + m functions $u_r$, $v_r$, where the $u_r$ are almost periodic $(B^\lambda)$ for arbitrarily
large $\lambda$, and since $\lambda \geq 1$ by assumption, it follows that the proof of the Lemma
will be complete if one shows that the product


(21)  $v(n) = \prod_{r=1}^{m} v_r(n)$


of the m almost periodic $(B^\lambda)$ functions $v_r$ is almost periodic $(B^\lambda)$. And this
may be shown by induction from m to m + 1, as follows:

Suppose that v is already known to be almost periodic $(B^\lambda)$. Let w be a
prime multiplicative function which belongs to a prime number p distinct from
the primes $p_1, \cdots, p_m$ to which the factors $v_1, \cdots, v_m$ of v belong. Suppose
finally that w is (as are these factors $v_r$ of v) not less than 1 and almost
periodic $(B^\lambda)$. The induction from m to m + 1 requires one to prove that the
product vw is almost periodic $(B^\lambda)$. To this end, let $w^j = w^j(n)$ denote the
function which one obtains by applying the definition (9) to g = w, where
$j = 1, 2, \cdots$. Then $w^j$ is periodic, hence almost periodic $(B^\lambda)$ for every $\lambda$,
and so the product $v w^j$ is almost periodic $(B^\lambda)$, since $\lambda \geq 1$. Hence, the proof
of the almost periodicity $(B^\lambda)$ of vw will be complete if one shows that


$M\{d^j\} \to 0$ as $j \to \infty$,

where $d^j = d^j(n)$ denotes the function

$d^j = |vw - vw^j|^\lambda$;  $(j = 1, 2, \cdots)$.


But it is clear from the definitions of v, w and $w^j$, that $d^j$ is for every fixed j
a generalized multiplicative function in the sense defined before Theorem
IV bis. Hence, by Theorem IV bis, it is sufficient to show [cf. (17)-(18)] that

(22)  $\sum_{n=1}^{\infty} |d^j_*(n)|$

is convergent for a fixed j and that

$|M\{d^j\}| = \left| \sum_{n=1}^{\infty} d^j_*(n) \right| \leq \sum_{n=1}^{\infty} |d^j_*(n)| \to 0$ as $j \to \infty$,


where $d^j_*(n)$ is obtained by applying the definition (19) to $f = d^j$. And this
may be proved as follows:




Since $v_1, \cdots, v_m$ and w are prime multiplicative and almost periodic
$(B^\lambda)$, Theorem II assures that condition (8) is satisfied by any of the m + 1
functions $g = w, v_1, \cdots, v_m$; so that, by the definition (19) of the asterisk
symbol,

(23)  $\sum_{n=1}^{\infty} |d^0_*(n)| < \infty$,


if $d^0 = d^0(n)$ denotes the multiplicative function

$d^0 = (vw)^\lambda = (v_1 \cdots v_m w)^\lambda$; cf. (21).


But it is clear from (19) and from the definitions of $w^j$ and $d^j$, where j > 0,
that the series (23) is a majorant of (22), also if one replaces by zeros those
terms of (23) whose index n is not divisible by the (j + 1)-th power of p. Hence,
it is clear from (23) that the series (22) is convergent for every j and has
a value which tends to 0 as $j \to \infty$.


This completes the proof of the Lemma.


Remark. It may be mentioned that also the converse of the Lemma is
true, i.e., that for the almost periodicity $(B^\lambda)$ of a finite product of prime
multiplicative functions belonging to different primes it is not only sufficient
but also necessary that each of these prime multiplicative functions be almost
periodic $(B^\lambda)$, where $\lambda \geq 1$ is arbitrarily fixed. This converse of the Lemma
is not needed in what follows, so that the proof will be omitted.


THEOREM V. If $\lambda \geq 1$ is fixed and f is a real non-negative multiplicative
function, then f is almost periodic $(B^\lambda)$ whenever

(24)  $\sum_{n=1}^{\infty} |f^\lambda_*(n)| < \infty$.


It is understood that by $f^\lambda_*$ is meant the function which belongs to $f^\lambda$ in the
same way as the multiplicative function $f_*$ defined by (15) belongs to f, and
that $f^\lambda = f^\lambda(n)$ denotes the $\lambda$-th power of f = f(n).


Remark. In view of the Remark which follows Theorem IV, condition
(24) is necessary as well for the almost periodicity $(B^\lambda)$ of f in case $f^\lambda_* \geq 0$.


Proof. It is clear from the definition (19) that if (24) is satisfied by
the non-negative multiplicative function f, then it is also satisfied by each of
its prime multiplicative factors $f_m \geq 0$. Thus, condition (8) is satisfied by
$g = f_1, f_2, \cdots$, and so Theorem II assures the almost periodicity $(B^\lambda)$ of
any of these prime multiplicative functions. It follows therefore from the
preceding Lemma that the multiplicative function


(25)  $h_m(n) = \prod_{r=1}^{m} f_r(n)$


of n is almost periodic $(B^\lambda)$ for every m. Hence, in order to complete the
proof of Theorem V, it would be sufficient to prove that

(26)  $M\{|f - h_m|^\lambda\} \to 0$ as $m \to \infty$.

However, it will be convenient to carry out the limit process leading from
(25) to (1) in another manner, namely by means of a pair of auxiliary functions
u = u(n), v = v(n) which are defined in terms of the given function
f = f(n) as follows:
Both functions u(n), v(n) are multiplicative, and


(27)  $u(p^k) = \mathrm{Min}\,(1, f(p^k))$;  $v(p^k) = \mathrm{Max}\,(1, f(p^k))$


for every prime p and for every $k \geq 1$. Since $f \geq 0$ by assumption, it is
clear from (27) that


(28)  $0 \leq u \leq 1 \leq v$,

while

(29)  $f = uv$


for every n. Furthermore, from (27) and (15),

(30)  $|u^\lambda_*| \leq |f^\lambda_*|$;  $|v^\lambda_*| \leq |f^\lambda_*|$.

By $u_m$ and $v_m$ will be meant the prime multiplicative functions which belong
to the m-th prime number $p = p_m$ in the same way as $f_m$ in (1) and (2)
belongs to f; so that


(31)  $u(n) = \prod_{m=1}^{\infty} u_m(n)$;  $v(n) = \prod_{m=1}^{\infty} v_m(n)$.

It is clear from (30) that the assumption (24) remains satisfied if one
replaces f by u or by v. Since it was shown before that each of the partial
products (25) of the infinite product (1) is almost periodic $(B^\lambda)$ if f satisfies
(24), it follows that each of the partial products of either of the infinite
products (31) is almost periodic $(B^\lambda)$. Hence, in order to prove that u and v
are almost periodic $(B^\lambda)$, it is sufficient to show that


(32)  $M\{|u - \alpha_m|^\lambda\} \to 0$;  $M\{|v - \gamma_m|^\lambda\} \to 0$,  $m \to \infty$,

where $\alpha_m = \alpha_m(n)$, $\gamma_m = \gamma_m(n)$ denote the multiplicative functions


(33)  $\alpha_m = \prod_{r=1}^{m} u_r$;  $\gamma_m = \prod_{r=1}^{m} v_r$.

But if both functions u, v are known to be almost periodic $(B^\lambda)$, then also
their product is almost periodic $(B^\lambda)$, since u is bounded, by (28). It follows
therefore from (29) that the proof of Theorem V will be complete if one
verifies (32).

In order to prove (32), notice that, by (27), (28) and (31),


(34)  $0 \leq u \leq \alpha_m \leq 1 \leq \gamma_m \leq v$


for every n. Furthermore, (30) and (24) assure that (17) is satisfied by
any of the functions $f = u^\lambda, v^\lambda, \alpha_m^\lambda, \gamma_m^\lambda$; so that, by Theorem IV, the average
of any of these functions exists. And these averages satisfy, in view of (18),
the limit relations

(35)  $M\{\alpha_m^\lambda\} \to M\{u^\lambda\}$;  $M\{\gamma_m^\lambda\} \to M\{v^\lambda\}$ as $m \to \infty$.

But it is clear from (34) and (35) that the proof of (32), hence also the
proof of Theorem V, will be complete if one verifies the following elementary
lemma (which has nothing to do with multiplicative functions):

LEMMA. If there exists a finite average M for the $\lambda$-th power ($\lambda \geq 1$)
of each of the real non-negative functions F = F(n); $F_m = F_m(n)$,
and if, for every fixed m,

(36)  either $F_m(n) \leq F(n)$ for every n or $F_m(n) \geq F(n)$ for every n,


then the limit relation


(37)  $M\{F_m^\lambda\} \to M\{F^\lambda\}$,  $m \to \infty$,

implies that

(38)  $M\{|F - F_m|^\lambda\} \to 0$,  $m \to \infty$.


In order to prove this Lemma, notice that, since $\lambda \geq 1$,

$(1 - t)^\lambda \leq 1 - t^\lambda$ for $0 \leq t \leq 1$.

Hence, it is clear from (36) that

$|F - F_m|^\lambda \leq |F^\lambda - F_m^\lambda|$.

Consequently, (38) follows from (37) in view of (36).
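
The elementary inequality behind this Lemma is easily probed numerically. The sketch below is an illustration only; it assumes the inequality reads $(1 - t)^\lambda \leq 1 - t^\lambda$ for $0 \leq t \leq 1$ and $\lambda \geq 1$, and that its consequence is $|a - b|^\lambda \leq |a^\lambda - b^\lambda|$ for non-negative a, b. A small tolerance absorbs floating-point rounding.

```python
def power_gap_holds(t, lam):
    """Check (1 - t)**lam <= 1 - t**lam for 0 <= t <= 1 and lam >= 1."""
    return (1 - t) ** lam <= (1 - t ** lam) + 1e-12


def difference_bound_holds(a, b, lam):
    """Check |a - b|**lam <= |a**lam - b**lam| for non-negative a, b and lam >= 1."""
    right = abs(a ** lam - b ** lam)
    return abs(a - b) ** lam <= right * (1 + 1e-9) + 1e-12
```

The second inequality follows from the first by dividing through by the larger of the two numbers, which is why a one-sided comparison as in (36) lets the means of the differences be controlled by the differences of the means.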

This completes the proof of Theorem V. 

Theorem V may be refined by exhibiting a sequence of almost periodic
functions $(B^\lambda)$ which are explicitly defined in terms of f and tend to f with
reference to the metric of the space $(B^\lambda)$:


THEOREM V bis. If a real non-negative multiplicative function f satisfies
(24) for a fixed $\lambda \geq 1$, then the products (1), (25) are almost periodic $(B^\lambda)$
and satisfy (26).


Proof. It was shown in the proof of Theorem V that each of the real
non-negative functions $h_m$; $\alpha_m$, $\gamma_m$; u, v; f of n is almost periodic $(B^\lambda)$.

Furthermore, (25), (27) and (33) imply that $h_m = \alpha_m \gamma_m$; so that, by (29),

$f - h_m = (u - \alpha_m) v + (v - \gamma_m) \alpha_m$.


Hence, if $\lambda > 1$, it is seen from Minkowski’s inequality that in order to prove
(26), it is sufficient to show that


$M\{|(u - \alpha_m) v|^\lambda\} \to 0$;  $M\{|(v - \gamma_m) \alpha_m|^\lambda\} \to 0$,  $(m \to \infty)$,


where u; $\alpha_1, \alpha_2, \cdots$ are, by (28) and (34), uniformly bounded functions of n.
Consequently, if $\lambda > 1$, the assertion (26) is, in view of Hölder’s inequality
and of the almost periodicity $(B^\lambda)$ of v, implied by


$M\{|u - \alpha_m|\} \to 0$;  $M\{|v - \gamma_m|^\lambda\} \to 0$,  $(m \to \infty)$,


a pair of relations which obviously imply (26) in the limiting case $\lambda = 1$
also. Since this pair of relations is, in view of the uniform boundedness of
the functions $u - \alpha_1, u - \alpha_2, \cdots$ of n, equivalent to (32), the proof of
Theorem V bis is complete.


THEOREM VI. A multiplicative function $f \geq 0$ is almost periodic (B)
whenever

(39)  $\sum_{n=1}^{\infty} |f_*(n)| < \infty$,


where $f_*(n)$ is the multiplicative function defined by


$f_*(p^k) = \frac{f(p^k) - f(p^{k-1})}{p^k}$.

The Fourier expansion of f then is

(40)  $f(n) \sim \sum_{m=1}^{\infty} a_m c_m(n)$, where $a_m = \sum_{l=1}^{\infty} f_*(ml)$,


$c_m(n)$ denoting the Ramanujan sum (13). In particular, f is limit-periodic
(grenzperiodisch).


Proof. It is clear from the assumptions of Theorem VI that the assumptions
of Theorem III are satisfied by each of the prime multiplicative functions
$g = f_1, f_2, \cdots$ which occur in the factorization (1) of f. It follows therefore
from (11) and (12) that (40) is true if f is replaced by any of the functions
$f_1, f_2, \cdots$. Since (13) is known to be a multiplicative function of m for
every fixed n, it follows that the series (40) belonging to the finite product
$f = h_m = f_1 f_2 \cdots f_m$ may be obtained by a formal multiplication of the m
Fourier series (40) which belong to $f = f_1, f_2, \cdots, f_m$, respectively. But the
resulting formal product is identical with the Fourier series of the function
$h_m = f_1 f_2 \cdots f_m$, as seen by a repeated application (for $\lambda = 1$) of the Lemma
which precedes Theorem V bis. Consequently, (40) is true if the infinite
product (1) is replaced by the finite product (25), where m is arbitrary.
Hence, on applying (26) for $\lambda = 1$, one sees that (40) holds for the infinite
product (1) also. This completes the proof of Theorem VI.

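The above results will now be applied to the case of classical multiplicative
functions, involving Euler’s $\varphi$-function and the sum, $\sigma_\alpha(n)$, of the
$\alpha$-th powers of the divisors of n; so that

(41)  $\sigma_\alpha(n) = \sum_{d \mid n} d^\alpha$, i.e., $\sigma_\alpha(p^k) = \frac{p^{\alpha k + \alpha} - 1}{p^\alpha - 1}$

(where it is understood that $\sigma_0(p^k)$ denotes the limit (= k + 1) of $\sigma_\alpha(p^k)$ as
$\alpha \to +0$; so that the function $\sigma_0(n)$ represents the number of the divisors
of n).

For the sake of shortness, a function will be called almost periodic $(B^\infty)$
if it is almost periodic $(B^\lambda)$ for arbitrarily large $\lambda$.

(i) The multiplicative function $f(n) = \sigma_\alpha(n)/n^\alpha$ is almost periodic
$(B^\infty)$ for every $\alpha > 0$ and has the Fourier expansion

(42)  $\frac{\sigma_\alpha(n)}{n^\alpha} \sim \zeta(\alpha + 1) \sum_{m=1}^{\infty} \frac{c_m(n)}{m^{\alpha+1}}$,  $(\alpha > 0)$;


while $M\{\sigma_0\} = +\infty$.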

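Proof. In order to prove the almost periodicity $(B^\infty)$, it is sufficient to
show that the criterion (24) of Theorem V is satisfied for every positive $\lambda$.
But if $f(n) = \sigma_\alpha(n)/n^\alpha$ then, by (41),


$f^\lambda(p^k) = \left( \frac{p^{\alpha k + \alpha} - 1}{(p^\alpha - 1)\, p^{\alpha k}} \right)^\lambda$,


so that

(43)  $f^\lambda_*(p^k) = \frac{(p^{\alpha k + \alpha} - 1)^\lambda - (p^{\alpha k + \alpha} - p^\alpha)^\lambda}{(p^\alpha - 1)^\lambda\, p^{\alpha \lambda k + k}}$,

by (15). Hence, it is easy to see that $f^\lambda_* > 0$; so that, on applying to the
series (24) the Euler factorization, one sees that the criterion of Theorem V
requires that


$\prod_p \sum_{k=0}^{\infty} \frac{(p^{\alpha k + \alpha} - 1)^\lambda - (p^{\alpha k + \alpha} - p^\alpha)^\lambda}{(p^\alpha - 1)^\lambda\, p^{\alpha \lambda k + k}} < \infty$ for every $\lambda > 0$.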


But this product of sums may be rewritten as 
(1— pe (1—p*)> (p—1) 


II =II 


(1— pict (1— pr) 


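and is therefore of the form


$\prod_p \left( 1 + O(p^{-1-\alpha}) + O(p^{-2}) \right)$.


And this product is absolutely convergent for every $\lambda > 0$ and for every
$\alpha > 0$, since $\sum_p p^{-1-\epsilon} < \infty$ for every $\epsilon > 0$.


This completes the proof of the almost periodicity $(B^\infty)$ of (41) for
every $\alpha > 0$. Since application of (43) for $\lambda = 1$ shows that $f_*(p^k) = p^{-k(\alpha+1)}$,
one has, for $m = 1, 2, \cdots$,

$a_m = \sum_{l=1}^{\infty} f_*(ml) = \sum_{l=1}^{\infty} (ml)^{-\alpha-1} = \zeta(\alpha + 1)\, m^{-\alpha-1}$;

so that (40) reduces to (42).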
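
The case α = 1 of (42) lends itself to a direct numerical check. The sketch below is an illustration only; it assumes that (42) reads $\sigma_\alpha(n)/n^\alpha \sim \zeta(\alpha+1) \sum_m c_m(n)/m^{\alpha+1}$, uses $\zeta(2) = \pi^2/6$, and evaluates the Ramanujan sums from the cosine definition (13). Since $|c_m(n)|$ does not exceed σ(n), the truncation error after M terms is below $\sigma(n)\,\zeta(2)/M$.

```python
import math


def ramanujan_sum(m, n):
    """c_m(n) as in (13): sum of cos(2*pi*k*n/m) over 1 <= k <= m with (k, m) = 1."""
    total = sum(math.cos(2 * math.pi * k * n / m) for k in range(1, m + 1) if math.gcd(k, m) == 1)
    return round(total)  # the sum is an integer up to rounding error


def sigma(n):
    """Sum of the divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


def ramanujan_partial_sum(n, terms):
    """zeta(2) * sum_{m <= terms} c_m(n) / m**2, the case alpha = 1 of (42)."""
    zeta2 = math.pi ** 2 / 6
    return zeta2 * sum(ramanujan_sum(m, n) / m ** 2 for m in range(1, terms + 1))
```

For n = 6, for example, the partial sums settle near σ(6)/6 = 2.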
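The calculations involved become shorter in case of Euler’s $\varphi$-function,
in which case the result is as follows:


(ii) Both functions $\varphi(n)/n$, $n/\varphi(n)$ are almost periodic $(B^\infty)$. Their
Fourier expansions are

(44_1)  $\frac{\varphi(n)}{n} \sim \frac{6}{\pi^2} \sum_q c_q(n) \prod_{p \mid q} \frac{1}{1 - p^2}$,
(44_2)  $\frac{n}{\varphi(n)} \sim \frac{\zeta(2)\zeta(3)}{\zeta(6)} \sum_q c_q(n) \prod_{p \mid q} \frac{1}{1 - p + p^2}$,

where the summation index q runs through the quadratfrei positive integers.


Proof. Since $\varphi(p^k) = p^k - p^{k-1}$, application of the definition (15) to the
$\lambda$-th powers of the respective functions $f(n) = \varphi(n)/n$ and $f(n) = n/\varphi(n)$
gives


(45_1)  $f^\lambda_*(p) = p^{-1} ((1 - p^{-1})^\lambda - 1)$,  $f^\lambda_*(p^k) = 0$ if k > 1,
(45_2)  $f^\lambda_*(p) = p^{-1} ((1 - p^{-1})^{-\lambda} - 1)$,  $f^\lambda_*(p^k) = 0$ if k > 1.


Hence, the criterion (24) of Theorem V requires, for a fixed $\lambda$, that


$\prod_p (1 + |A_p(\lambda)|) < \infty$, i.e., that $\sum_p |A_p(\lambda)| < \infty$,

where

$A_p(\lambda) = p^{-1} ((1 - p^{-1})^\lambda - 1)$ and $A_p(\lambda) = p^{-1} ((1 - p^{-1})^{-\lambda} - 1)$,

respectively. Since it is clear that $|A_p(\lambda)| < \mathrm{const.}\; p^{-2}$ holds
for every $\lambda > 0$ in both cases, the almost periodicity $(B^\infty)$ follows. Finally,
application of (45_1) and (45_2) for $\lambda = 1$ gives

$f_*(p) = -p^{-2}$ and $f_*(p) = (p^2 - p)^{-1}$,

respectively; so that (44_1) and (44_2) readily follow from (40).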
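
The mean values of the two functions in (ii) can be compared with the Euler products that (18) suggests. The sketch below is an illustration only; it assumes that, for λ = 1, the definition (15) gives $f_*(p) = -p^{-2}$ for $\varphi(n)/n$ and $f_*(p) = 1/(p(p-1))$ for $n/\varphi(n)$, with $f_*(p^k) = 0$ for k > 1, so that the mean value is the product of $1 + f_*(p)$ over the primes. Function names are hypothetical.

```python
import math


def totients(limit):
    """Euler's function for 0..limit by a sieve."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:  # p has not been touched by a smaller prime, so p is prime
            for multiple in range(p, limit + 1, p):
                phi[multiple] -= phi[multiple] // p
    return phi


def empirical_means(limit):
    """Arithmetical means of phi(n)/n and n/phi(n) over 1..limit."""
    phi = totients(limit)
    mean_ratio = sum(phi[n] / n for n in range(1, limit + 1)) / limit
    mean_inverse = sum(n / phi[n] for n in range(1, limit + 1)) / limit
    return mean_ratio, mean_inverse


def euler_products(limit):
    """prod (1 + f_*(p)) over primes p <= limit for phi(n)/n and n/phi(n)."""
    phi = totients(limit)
    primes = [p for p in range(2, limit + 1) if phi[p] == p - 1]
    product_ratio = math.prod(1 - 1 / p ** 2 for p in primes)
    product_inverse = math.prod(1 + 1 / (p * (p - 1)) for p in primes)
    return product_ratio, product_inverse
```

The first product tends to $6/\pi^2$, and both empirical means agree with the corresponding products to about two decimal places for a range of twenty thousand integers.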


THE JOHNS HOPKINS UNIVERSITY. 




ON UNIFORMLY ALMOST PERIODIC MULTIPLICATIVE AND 
ADDITIVE FUNCTIONS.* 


By E. R. van KAMPEN. 


In this note conditions are established for certain multiplicative or
additive functions to be uniformly almost periodic¹ (u. a. p.).

A subscript p on the symbols $\prod$ or $\sum$ will denote a product or sum over
all primes, except that sometimes (explicitly) a finite number of primes will
be excluded.

A multiplicative function f is an arithmetical function f = f(n),
$n = 1, 2, \cdots$, which satisfies


(1)  $f(n_1 n_2) = f(n_1) f(n_2)$ if $(n_1, n_2) = 1$,  [f(1) = 1].


Such a function may be written in the form


(2)  $f(n) = \prod_p f_p(n)$,  $(n = 1, 2, \cdots)$,

where $f_p(n)$ is for a fixed prime p defined by

(3)  $f_p(n) = f(p^k)$ if $p^k \mid n$ and $p^{k+1} \nmid n$.


The product in (2) clearly is a finite product for every fixed n.
An additive function g is an arithmetical function g = g(n), $n = 1, 2, \cdots$,
which satisfies


(4)  $g(n_1 n_2) = g(n_1) + g(n_2)$ if $(n_1, n_2) = 1$,  [g(1) = 0].

Such a function may be written in the form

(5)  $g(n) = \sum_p g_p(n)$,  $(n = 1, 2, \cdots)$,


where $g_p(n)$ is, for a fixed prime p, defined by

(6)  $g_p(n) = g(p^k)$ if $p^k \mid n$ and $p^{k+1} \nmid n$.


Thus the sum in (5) is a finite sum for every fixed n.
The main result may be formulated as follows: 


THEOREM 1. An additive function g(n) [real-valued multiplicative func- 
tion f(n)] is u.a. p. if and only if the sum representation (5) [the product 

* Received November 30, 1939. 

1 Conditions for such functions to belong to a class $(B^\lambda)$ of Besicovitch almost
periodic functions are investigated in: E. R. van Kampen and Aurel Wintner, American
Journal of Mathematics, vol. 62 (1940), pp. 613-626.


627 


628 E. R. VAN KAMPEN. 


representation (2)] is uniformly convergent and each summand gp [factor fy] 
is U. Pp. 


The sufficiency of the above conditions is, of course, evident from the 
elementary properties of u. a. p. functions, also for complex valued multiplica- 
tive functions. On expressing the conditions in analytical terms, one obtains 
the following theorems: 


THEOREM 2. An additive function g is u. a. p. if and only if the series


(7) $\sum_p \text{l.u.b.}_k \, |g(p^k)|$ is convergent,
and the limit
(8) $\alpha_p = \lim g(p^k)$ exists for every prime p, $(k \to \infty)$.
It will be clear from the proof that if g(n) is u. a. p., then the unique u. a. p.
extension of g(n) to non-positive values of n may be obtained by placing
$g(-n) = g(n)$ and $g(0) = \sum_p \alpha_p$.

THEOREM 3. A multiplicative function f is u. a. p. if, and in case of
real-valued functions only if, the series


(9) $\sum_p \text{l.u.b.}_k \, |f(p^k) - 1|$ is convergent,
and the limit
(10) $\beta_p = \lim f(p^k)$ exists for every p, $(k \to \infty)$.


And one sees easily that the unique u. a. p. extension of f(n) for non-positive
n may be obtained by placing $f(-n) = f(n)$ and $f(0) = \prod_p \beta_p$.

The proof of Theorem 2 is simpler than the proof of Theorem 3. One
could reduce Theorem 2 to a special case of Theorem 3 by considering the
multiplicative functions $\exp(\Re g(n))$ and $\exp(\Im g(n))$, where g(n) is
additive. The, apparently difficult, complex case will be reduced in Theorem 5
to the case of additive functions modulo 1. This reduction depends on the
following theorem on u. a. p. arithmetical functions h(n) of absolute value 1.


THEOREM 4. If the function h(n) is u. a. p. and satisfies $|h(n)| = 1$,
$n = 1, 2, \ldots$, then there exist an integer P, real numbers $c_u$ and real-valued
u. a. p. functions $\psi_u$, such that

$h(u + nP) = \exp 2\pi i (c_u n + \psi_u(n))$,
$(u = 1, \ldots, P;\ n = 1, 2, \ldots)$.

Theorem 4 is a special case of a known theorem concerning generalized
almost periodic functions on groups.²


² E. R. van Kampen, Journal of the London Mathematical Society, vol. 12 (1937),
pp. 3-6; the result needed is (2), as modified by the last remark on p. 4. Since this




Use will be made of the following lemmas:

I. If the function $\phi(n)$ satisfies for a fixed prime p the requirement
(11) $\phi(n) = \phi(p^k)$ if $p^k \mid n$ and $p^{k+1} \nmid n$, $(k = 0, 1, 2, \ldots)$,
then $\phi$ is u. a. p. if and only if
(12) $\lim \phi(p^k)$ exists, $(k \to \infty)$.


Clearly this lemma³ is applicable both to the summands $g_p$ of g and to
the factors $f_p$ of f.

Let $\phi_j$ denote, for $j = 1, 2, \ldots$, the periodic functions defined by

$\phi_j(n) = \phi(n)$ if $n \le p^j$, $\phi_j(n + p^j) = \phi_j(n)$ for every n.
If (12) is satisfied, then $\phi_j \to \phi$ uniformly with respect to n as $j \to \infty$,
so that $\phi(n)$ is u. a. p. In order to prove the converse, assume that $\phi(n)$ is
u. a. p., and let $\mu = \mu(l)$ denote the translation function of $\phi(n)$, i.e., let $\mu(l)$
be for a fixed l the least upper bound of the function $|\phi(n + l) - \phi(n)|$
of n $(= 1, 2, \ldots)$. It is clear from (11) that $\mu(l)$ is the least upper bound
of the expression $|\phi(p^{k+j}) - \phi(p^k)|$ as $j = 1, 2, \ldots$, where $k = k(l)$ denotes
the number of times p occurs in the factorization of l. Since $\phi$ is u. a. p., the
lower limit of $\mu(l)$ as $l \to \infty$ is 0. Thus the lower limit, as $k \to \infty$, of the least
upper bound of $|\phi(p^{k+j}) - \phi(p^k)|$ over j is 0, so that (12) holds. This
completes the proof of I.
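
The sufficiency half of lemma I can be illustrated numerically. The following is only a sketch under stated assumptions: the prime is p = 2, the values $\phi(p^k) = 1 - 2^{-k}$ are a hypothetical choice for which the limit (12) exists, and the periodic approximants are taken to agree with $\phi$ on $1 \le n \le p^j$ and to have period $p^j$ (one reading of the definition in the proof). The names `valuation`, `make_phi`, `make_phi_j` and `sup_error` are illustrative.

```python
def valuation(n, p):
    """Exponent k with p**k dividing n exactly."""
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def make_phi(p, values):
    """phi(n) = values(k) where p**k exactly divides n, as required by (11)."""
    return lambda n: values(valuation(n, p))


def make_phi_j(phi, p, j):
    """Periodic function of period p**j that agrees with phi on 1..p**j."""
    period = p ** j

    def phi_j(n):
        r = n % period
        return phi(r if r != 0 else period)

    return phi_j


p = 2
phi = make_phi(p, lambda k: 1 - 2.0 ** (-k))  # hypothetical values; limit is 1


def sup_error(j, limit=4096):
    """Largest |phi(n) - phi_j(n)| over 1 <= n <= limit."""
    phi_j = make_phi_j(phi, p, j)
    return max(abs(phi(n) - phi_j(n)) for n in range(1, limit + 1))
```

In this example the two functions differ only at multiples of $p^j$, and the error there is at most $\sup_{k \ge j} |\phi(p^k) - \phi(p^j)|$, which tends to 0 with j.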


II. The additive function g(n) is bounded if and only if (7) holds,
and in that case the series (5) representing g(n) is uniformly convergent.


First, if (7) holds, then the absolute value of any g(n) is not more than
the sum of the series in (7), so that g is bounded. On the other hand, if
$|g(n)| \le M$ holds for every n, then $|\sum' g(p^k)| \le M$, where the sum $\sum'$ is
taken over any finite number of distinct primes p and the exponents are
arbitrary. But this implies that $\sum_p |g(p^k)|$ converges uniformly in the
exponents k, and so (7) holds and (5) is uniformly convergent. This completes
the proof of II.


modification has only been indicated, the following reduction of Theorem 4 to a
conjecture of Wintner which was subsequently proved by Bohr (Danske videnskabernes
Selskab X, vol. 10 (1930)) might be useful. Let P be a translation number of h(n)
which belongs to the value 1. Then $h_u(n) = h(u + nP)$ is, for $u = 1, \ldots, P$, a u. a. p.
function for which $|h_u(n + 1) - h_u(n)| \le 1$. Thus the continuous u. a. p. function
obtained from $h_u(n)$ by linear interpolation does not come arbitrarily near to 0. On
applying to this continuous function the result of Bohr, one obtains the constant $c_u$ and
the function $\psi_u$ with the properties stated in Theorem 4.

³ This lemma is closely related to a result of Toeplitz, Mathematische Annalen,
vol. 98 (1927), p. 282.




III. If the additive function g(n) is u. a. p., then (7) and (8) are
satisfied.

Since (7) follows from the boundedness of g, by II, it remains to
prove that (8) holds.

Let $p_0$ be a fixed prime and let $g^0(n)$ denote the summand of g in (5)
which corresponds to $p_0$. If $\mu(l)$ and $\mu^0(l)$ denote the translation functions
of g and $g^0$, it will first be shown that


(13) $\mu^0(2l) \le \mu(2l)$ holds for $l = 1, 2, \ldots$.


Let $\epsilon > 0$ be given and let a partial sum $g' = g'(n)$ of the uniformly
convergent series (5) be determined in such a way that $|g(n) - g'(n)| < \epsilon$
for every n, and that the term of (5) which corresponds to the prime $p_0$ is
one of the terms in $g'$. Then, if $\mu'(l)$ denotes the translation function of $g'$,
one has $\mu'(2l) \le \mu(2l) + 2\epsilon$. Since $\mu^0(2l)$ and $\mu(2l)$ are independent of $\epsilon$,
it is clear that (13) will follow if one proves that $\mu^0(2l) \le \mu'(2l)$ for
$l = 1, 2, \ldots$.

To this effect, let l and n be given and let k be determined in such a way
that neither n nor $n + 2l$ is divisible by $p_0^k$. Then, from (6),


$g^0(n + r p_0^k) = g^0(n)$ and $g^0(n + 2l + r p_0^k) = g^0(n + 2l)$,
$(r = 1, 2, \ldots)$.


The number r may be determined in such a way that $n + r p_0^k$ and
$n + 2l + r p_0^k$ are not divisible by any of the primes (except $p_0$) which
correspond to summands occurring in $g'$.⁴ For this r one has


$g'(n + r p_0^k) = g^0(n + r p_0^k)$ and $g'(n + 2l + r p_0^k) = g^0(n + 2l + r p_0^k)$,


since every summand of $g'$ (with the exception of $g^0$) is 0 at the values of n
in question. Hence


$|g'(n + 2l + r p_0^k) - g'(n + r p_0^k)| = |g^0(n + 2l) - g^0(n)|$,


so that $\mu^0(2l) \le \mu'(2l)$ by the definition of a translation function. This
completes the proof of (13).

Since g is u. a. p., one has $\liminf \mu(2l) = 0$ as $l \to \infty$. Thus, from (13),
also $\liminf \mu^0(2l) = 0$ as $l \to \infty$. But it is clear from the proof of I that
the last relation implies the existence of the limit (8) for $p = p_0$. Since $p_0$
was an arbitrary prime number, the proof of III is complete.

The proof of Theorem 2 is now evident. In fact, if (7) and (8) hold
for the additive function g, then g is, by II, the sum of the uniformly
convergent series (5), the terms of which are u. a. p., by I. Thus g is a u. a. p.
function. On the other hand, if the additive function g is u. a. p., then (7)
and (8) hold, by III. Thus the proof of Theorem 2 is complete. And it is
clear from I, II and Theorem 2 that the part of Theorem 1 which concerns
additive functions is correct.

⁴ In fact, r is a solution of a system of linear congruences to distinct prime moduli.
Note that at least one of the numbers n and $n + 2l + 1$ is even, so that the restriction
to even translations 2l is necessary.


IV. A multiplicative function f(n) satisfies (9) if and only if f(n) is
bounded and the product (2) representing f(n) is uniformly convergent.


That (9) implies the boundedness and the uniform convergence of (2)
is obvious. But, if (2) is uniformly convergent, then there exists for every
$\epsilon > 0$ a prime number q such that $|1 - \prod' f(p^k)| < \epsilon$ if the product $\prod'$ is
taken over any finite number of primes p larger than q, and the k are arbitrary
exponents. If in addition f is bounded, then the product $\prod_p f(p^k)$ (taken
over all primes) converges absolutely-uniformly with respect to the exponents
k. Thus $\sum_p |f(p^k) - 1|$ converges uniformly with respect to the exponents k.
And this, obviously, implies (9), so that the proof of IV is complete.

V. If f(n) is a u. a. p. function and if (9) holds, i.e., if the product (2)
is uniformly convergent, then (10) holds also.

The proof of this statement will be omitted, since it may be obtained from
the proof of III by unessential modifications.


VI. If f(n) is a non-negative, multiplicative u. a. p. function, then f
satisfies condition (9).


Let M be an upper bound of f, so that $0 \le f \le M$ for every n. Then one
has $0 \le \prod' f(p^k) \le M$, where the product is taken over any finite number of
distinct primes and the exponents k are arbitrary. Thus $\sum_p (\text{l.u.b.}_k f(p^k) - 1)$
is convergent, and (9) will follow if it is proved that $\sum_p (1 - \text{gr.l.b.}_k f(p^k))$
also is convergent (note that $f(p^0) = 1$).

Let the integer L be such that every group of L consecutive integers
contains a translation number of f(n) belonging to $\tfrac12$. If $\sum_p (1 - \text{gr.l.b.}_k f(p^k))$
is divergent, so that $\prod_p \text{gr.l.b.}_k f(p^k) = 0$, one can easily construct numbers
$n_1, \ldots, n_L$ such that

$f(n_i) < \dfrac{1}{2M}$, $(n_i, n_j) = 1$ if $i \ne j$, $(i, j = 1, \ldots, L)$.


Since no two numbers $n_i$ have a common factor, there exists an integer N
such that
$N + i \equiv n_i \pmod{n_i^2}$, for $i = 1, \ldots, L$.


Then $n_i$ and $(N + i)/n_i = n'_i$ are relatively prime integers, so that

$f(N + i) = f(n_i) f(n'_i) \le M f(n_i) < \tfrac12$, $(i = 1, \ldots, L)$.


This contradicts $f(1) = 1$, since there must exist at least one $i_0$ such that
$0 < i_0 \le L$ and $|f(N + i_0) - f(1)| < \tfrac12$. Thus $\sum_p (1 - \text{gr.l.b.}_k f(p^k))$ is
convergent and so (9) holds for a non-negative, u. a. p., multiplicative function f.
This completes the proof⁵ of VI.

It is clear from I and IV that the part of Theorem 1 which concerns 
multiplicative functions is an immediate consequence of Theorem 3. And, 
if a multiplicative function f satisfies (9) and (10), then the product (2) 
which represents f is uniformly convergent, by IV, and the factors of this 
product are u.a.p. functions, by I. Thus, in this case f is u.a. p., and the 
proof of the sufficiency of the conditions in Theorem 3 is complete. Now 
suppose that the real non-negative multiplicative function f is u.a.p. Then 
f satisfies (9), by VI, and hence f satisfies (10), by V. Thus the proof of 
the remaining part of Theorem 3 is complete in case the u. a. p. multiplicative 
function f is supposed to be non-negative. 

It is clear from V that the proof of Theorem 3 will be complete if the
following analogue of VI for real-valued multiplicative functions is proved:

VII. If f is a real-valued multiplicative u. a. p. function, then f satisfies (9).

First, by VI, the function $|f|$ satisfies (9), so that
(14) $\sum_p \text{l.u.b.}_k \, \big|\, |f(p^k)| - 1 \,\big|$ is convergent.

Let $p_1, \ldots, p_m$ be a finite number of distinct primes, including all those primes
for which $\text{gr.l.b.}_k |f(p^k)| = 0$. Then one has, as a consequence of (14),
$\prod_p \text{gr.l.b.}_k |f(p^k)| = c > 0$,
where the product is taken over all primes except $p_1, \ldots, p_m$.
Next, let $\Lambda(n)$ be the multiplicative function which is defined by $\Lambda(n) = 0$
or $\Lambda(n) = f(n)/|f(n)|$ according as n is or is not divisible by at least one of
the primes $p_1, \ldots, p_m$. Then $\Lambda(n)$ is a u. a. p. function. For if $\kappa(n)$ denotes
the (periodic) function which is 1 or 0 according as n is or is not divisible by
at least one of the primes $p_1, \ldots, p_m$, then $\chi(n) = (1 - \kappa(n)) |f(n)| + \kappa(n)$
is a real-valued u. a. p. function with the positive lower bound c. And so
$\Lambda(n) = (\chi(n)^{-1} - \kappa(n)) f(n)$ also is a u. a. p. function.

Because f(n) is real valued, the u. a. p. function $\Lambda(n)$ can only assume
the three values 0, 1 and $-1$, so that $\Lambda(n)$ is periodic. Since $\Lambda(n)$ is
multiplicative also, one has $\Lambda(p^k) = 1$ for every k if the prime p is not a divisor
of the primitive period of $\Lambda(n)$. Hence it is clear from (14) and the definition
of $\Lambda(n)$ that (9) holds for f(n). This completes the proof of VII, hence also
the proof of Theorem 3.


⁵ This proof was communicated to me by P. Erdős.


It is easily seen that the proof of VII may be extended to the case where
$\Lambda(n)$ is restricted to assume a finite number of values. Thus one obtains:


VIII. If the multiplicative function f(n) is u. a. p. and if in addition
there exists an integer r such that $f(p^k)^r$ is non-negative for every k unless p
is one of a finite number of primes, then f satisfies conditions (9) and (10).


In fact, if this finite number of primes is included among the primes
$p_1, \ldots, p_m$ of the proof of VII, then the function $\Lambda(n)$ of that proof can only
assume as values either 0 or $\exp(2\pi i k/r)$, $k = 1, \ldots, r$. Thus $\Lambda(n)$ will
again be a periodic function, so that the proof may be completed as the proof
of VII.

The reduction of the general multiplicative case to the case of additive
functions modulo 1 will now be formulated and proved. A function $\psi(n)$ will
be called additive modulo 1 if


$\psi(n_1 n_2) \equiv \psi(n_1) + \psi(n_2) \pmod 1$ if $(n_1, n_2) = 1$
and $n_1$ and $n_2$ are not divisible by one of a finite number of primes $p_1, \ldots, p_m$,
while $\psi(n) = 0$ for all n which are divisible by one of these primes. Let $\{c\}$
denote the distance from c to the nearest integer.


IX. The statement that conditions (9) and (10) are necessary for any
multiplicative function to be u. a. p., is equivalent to the statement that the
condition


(15) $\sum_p \text{l.u.b.}_k \, \{\psi(p^k)\} < +\infty$
is necessary for a function $\psi(n)$ to be u. a. p., if $\psi$ is additive modulo 1.


For a given u. a. p. multiplicative function f(n), let the u. a. p.
multiplicative function $\Lambda(n)$ and the periodic function $\kappa(n)$ be defined as in the
proof of VII. Then $h(n) = \Lambda(n) + \kappa(n)$ is u. a. p. and satisfies $|h(n)| = 1$,
so that Theorem 4 is applicable to h(n). It may be assumed that the integer
P of Theorem 4 is divisible by each of the primes $p_1, \ldots, p_m$ of the proof of
VII. For certain values of the integer u of Theorem 4, one will have
$h(u + nP) = \kappa(u + nP) = 1$, for every n. For the remaining values of u,
including in particular $u = 1$, one has $\kappa(u + nP) = 0$ and
(16) $\Lambda(u + nP) = \exp 2\pi i (c_u n + \psi_u(n))$, $(n = 1, 2, \ldots)$,


where $c_u$ is a real constant and $\psi_u$ a real-valued u. a. p. function. It will be
shown that one may choose $c_u = 0$ for every u for which (16) holds. Since
cn is congruent modulo 1 to a u. a. p. function of n (either for all n or on any
arithmetical sequence of values of n) if and only if c is rational, it will be
sufficient to show that $c_u$ is rational for every u for which (16) holds. Let
such a u be fixed.




There exists an arithmetical sequence of values of n such that


$(1 + P, u + nP) = 1$, hence $\Lambda(1 + P)\Lambda(u + nP) = \Lambda((1 + P)(u + nP))$.


Thus, since $(1 + P)(u + nP) = u + (n + u + nP)P$ and (16) holds for this u and for $u = 1$:


$c_1 + \psi_1(1) + c_u n + \psi_u(n) \equiv (n + u + nP) c_u + \psi_u(n + u + nP) \pmod 1$,
$-P c_u n \equiv \psi_u(n + u + nP) - \psi_u(n) - \psi_1(1) - c_1 + u c_u \pmod 1$
for every n belonging to an arithmetical sequence. Since the right side is
u. a. p. on this sequence, so is the left side. Thus $c_u$ is rational, and one may
suppose that $c_u = 0$ for every u for which (16) holds. The resulting functions
$\psi_u(n)$ may be combined into one function $\psi$ by placing $\psi(u + nP) = \psi_u(n)$.
On applying the definition of $\Lambda(n)$, one sees that


(17) $f(n)/|f(n)| = \exp 2\pi i \psi(n)$ if $(n, p_1 p_2 \cdots p_m) = 1$,
where $\psi(n)$ is a real-valued u. a. p. function, which may be defined to be 0
if n is divisible by one of the primes $p_1, \ldots, p_m$. Since f(n) is multiplicative,
$\psi(n)$ is additive modulo 1. Thus if (15) were a necessary condition for a
function $\psi(n)$ of this type to be u. a. p., then (15) would hold for $\psi(n)$, and
it would be clear from (17), (14) and V that (9) and (10) were necessary
conditions for any multiplicative f to be u. a. p. This proves one part of the
assertion IX.

Now let $\psi(n)$ be u. a. p. and additive modulo 1. Then the function f(n)
which is defined by


$f(n) = \exp 2\pi i \psi(n)$ if $(n, p_1 p_2 \cdots p_m) = 1$,

$f(n) = 0$ if $(n, p_1 p_2 \cdots p_m) > 1$
is u. a. p. and multiplicative. Thus if (9) were a necessary condition for a
multiplicative function to be u. a. p., then this f would satisfy (9), so that
$\psi(n)$ would satisfy (15). This completes the proof of IX.

As an example to Theorem 3, consider the function $f(n) = \sigma_\alpha(n)/n^\alpha$, $\alpha > 0$,
where $\sigma_\alpha(n)$ denotes the sum of the $\alpha$-th powers of the divisors of n. It will
be shown that $\sigma_\alpha(n)/n^\alpha$ is uniformly almost periodic⁶ if and only if $\alpha > 1$.
In fact, one has for this $f = f(n)$:

$f(p^k) = \dfrac{p^{(k+1)\alpha} - 1}{p^{k\alpha}(p^\alpha - 1)} = 1 + \dfrac{1 - p^{-k\alpha}}{p^\alpha - 1}$,
so that $\text{l.u.b.}_k |f(p^k) - 1| = (p^\alpha - 1)^{-1}$, and (9) holds or does not hold
according as $\alpha > 1$ or $\alpha \le 1$. Since $f(p^k) \to (p^\alpha - 1)^{-1} + 1$ as $k \to \infty$ and
since $f(n) > 0$ for every n, the above statement follows from Theorem 1.
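
The prime-power values of $\sigma_\alpha(n)/n^\alpha$ can be checked exactly for integer $\alpha$. The sketch below is only an illustration: it assumes integer $\alpha \ge 1$ so that rational arithmetic applies, and the function names `f_prime_power` and `closed_form` are illustrative. It compares the defining divisor sum with the closed form $1 + (1 - p^{-k\alpha})/(p^\alpha - 1)$.

```python
from fractions import Fraction


def f_prime_power(p, k, alpha):
    """sigma_alpha(p**k) / p**(k*alpha), computed from the divisors 1, p, ..., p**k."""
    sigma = sum(Fraction(p) ** (j * alpha) for j in range(k + 1))
    return sigma / Fraction(p) ** (k * alpha)


def closed_form(p, k, alpha):
    """The value 1 + (1 - p**(-k*alpha)) / (p**alpha - 1)."""
    return 1 + (1 - Fraction(1, p ** (k * alpha))) / (p ** alpha - 1)


def excess_bound(p, alpha):
    """The least upper bound over k of |f(p**k) - 1|, namely 1 / (p**alpha - 1)."""
    return Fraction(1, p ** alpha - 1)
```

The values $f(p^k) - 1$ increase with k towards `excess_bound(p, alpha)` without reaching it, so the series in (9) behaves like $\sum_p (p^\alpha - 1)^{-1}$.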


THE JOHNS HOPKINS UNIVERSITY. 


⁶ The function $\sigma_\alpha(n)/n^\alpha$ is almost periodic $(B^\lambda)$ for every $\lambda$ if $\alpha > 0$. This was
shown loc. cit.¹, p. 625. Cf. Ramanujan, Collected Works, Cambridge (1927), p. 184,
formula (6.1).


ADDITIVE FUNCTIONS AND ALMOST PERIODICITY (B²).*


By PAUL ERDŐS and AUREL WINTNER.


1. By an additive function $f = f(n)$ is meant a sequence $f(1), f(2), \ldots$,
defined for every positive integer n in such a way that


(1) $f(n_1 n_2) = f(n_1) + f(n_2)$ whenever $(n_1, n_2) = 1$; $(f(1) = 0)$.
Thus,


(2) $f(n) = \sum_{k=1}^{\infty} f^{(k)}(n) = \lim_{k \to \infty} f_k(n)$,
where $f_k = f_k(n)$ denotes, for fixed k, the additive function


(3) $f_k(n) = \sum_{j=1}^{k} f^{(j)}(n)$,


and $f^{(k)} = f^{(k)}(n)$ is the additive function which is defined in terms of the
k-th prime number, $p_k$, as follows:


(4) $f^{(k)}(n) = 0$, if $n \not\equiv 0 \pmod{p_k}$;
    $f^{(k)}(n) = f(p_k^l)$, if $p_k^l \mid n$ and $p_k^{l+1} \nmid n$,


($p_1 = 2$, $p_2 = 3$, $p_3 = 5, \ldots$). Conversely, if $\{\{f(p_k^l)\}\}$ is any given double
sequence of numbers, then (4), (3), (2) define $f^{(k)}$, $f_k$, f, respectively, as
additive functions of n. In fact, all but a finite number of the terms of the
infinite series (2) are zero for every fixed n.

The function f(n) is called multiplicative if in condition (1) the sum
$f(n_1) + f(n_2)$ becomes replaced by the product $f(n_1) f(n_2)$. Conditions which
are either necessary or sufficient for the almost periodicity (B²) of a
multiplicative function f(n) are implied by the results of a recent paper.¹ However,
none of the results found loc. cit.¹ supplies a criterion which is necessary and
at the same time sufficient for the almost periodicity (B²) of a multiplicative
function (not even if f(n) is supposed to be real-valued). This situation is
not surprising, since if a real-valued multiplicative function f(n) changes
its sign with the uniformity of statistical randomness (as does the Möbius
function $f = \mu$), then the question as to a generalized almost periodic behavior

* Received December 4, 1939.

¹ E. R. van Kampen and Aurel Wintner, "On the almost periodic behavior of
multiplicative number-theoretical functions," American Journal of Mathematics, vol. 62
(1940), pp. 613-626.


of f(n) can involve problems of the same order of delicacy as do the relevant
generalisations of the prime-number theorem, if not of the Riemann hypothesis.
[While the prime-number theorem is equivalent to the statement that the
n-average of $\mu(n) \exp(i\lambda n)$ exists and vanishes for $\lambda = 0$, Davenport's results (Quarterly
Journal of Mathematics, vol. 8 (1937), pp. 313-320), which were obtained
by an application of the deep methods of Vinogradoff, imply that this average
exists and vanishes for every real $\lambda$. In other words, all Fourier coefficients
of $\mu(n)$ exist and vanish. Hence, $\mu(n)$ cannot be almost periodic (B²). For
if it were, the n-average of
$|\mu(n) - (0 + 0 + \cdots)|^2 = |\mu(n)|^2$
ought to vanish. But this average is known to be $6\pi^{-2} \ne 0$.]
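
The two averages mentioned in the bracketed remark can be observed numerically. The following sketch sieves the Möbius function up to a bound and compares the mean of $\mu(n)^2$ with $6/\pi^2$; the bound 100000 and the helper names `mobius_sieve` and `averages` are arbitrary choices for the illustration, and a finite computation of course proves nothing about the limits.

```python
import math


def mobius_sieve(limit):
    """Return a list mu with mu[n] equal to the Moebius function for 1 <= n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for m in range(p, limit + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]          # one sign change per prime factor
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0               # a square factor forces the value 0
    return mu


def averages(limit):
    """Return (mean of mu(n), mean of mu(n)**2) over 1 <= n <= limit."""
    mu = mobius_sieve(limit)
    mean = sum(mu[1:]) / limit
    mean_square = sum(v * v for v in mu[1:]) / limit
    return mean, mean_square


mean, mean_square = averages(100000)
target = 6 / math.pi ** 2   # density of the square-free integers
```

Here `mean` is close to 0 while `mean_square` is close to `target`, which is the tension the remark exploits: vanishing Fourier coefficients together with a positive mean square exclude almost periodicity (B²).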

The object of the present paper is to show that the problem admits of a
definitive solution in the case of additive, instead of multiplicative, functions.
In fact, the question of almost periodicity (B²) may then completely be
answered by the following theorem:


An additive function $f = f(n)$ is almost periodic (B²) if and only if
both series

(i) $\sum_p \dfrac{f(p)}{p}$, (ii) $\sum_p \sum_{l=1}^{\infty} \dfrac{|f(p^l)|^2}{p^l}$

are convergent.


This fact seems to be an arithmetical counterpart of a similar result
concerning the case of linearly independent frequencies (cf. loc. cit.³, pp. 79-80).
But we were unable to find the common source of these two parallel theorems.


It is understood that $\sum_p$ denotes summation over all prime numbers, which
are thought of as ordered according to magnitude (the series (i) need not
be absolutely convergent).


2. If $f'$ denotes the real, and $f''$ the imaginary, part of f, the function
$f(n) = f'(n) + i f''(n)$ is additive if and only if so are both functions $f'(n)$,
$f''(n)$. Similarly, f(n) is almost periodic (B²) if and only if so are $f'(n)$
and $f''(n)$. Finally, it is clear from $|f|^2 = (f')^2 + (f'')^2$ that both series
(i), (ii) are convergent if and only if so are the 2 + 2 series which one
obtains by writing $f'$ and $f''$ for f in (i), (ii).

Consequently, it is sufficient to prove the italicized theorem for the case 
of real-valued additive functions. The possibility of this reduction is essential 
for the method to be applied. In fact, use will be made of a criterion which 




recently² was proved to be necessary and sufficient for those real-valued
additive functions which possess an asymptotic distribution function. Now, a
generalization of this criterion for complex-valued functions is not known and
seems to lead to essential difficulties.

The criterion in question states² that a real-valued additive function f(n)
has an asymptotic distribution if and only if both series

(I) $\sum_p \dfrac{f^+(p)}{p}$, (II) $\sum_p \dfrac{f^+(p)^2}{p}$

are convergent, where $y^+ = f^+(n)$ is defined, for $y = f(n)$, by placing
(5) $y^+ = y$ or $y^+ = 1$ according as $|y| < 1$ or $|y| \ge 1$.


It follows that the convergence of both series (I), (II) is necessary for
every (real-valued, additive) f which is almost periodic (B²). In fact, it is
known³ that almost periodicity in relative measure and so, in particular,
almost periodicity in relative mean of any positive order (= 2 in the present
case) is always sufficient for the existence of an asymptotic distribution
function.

2 bis. Suppose, in particular, that $f(p) = O(1)$ as $p \to \infty$. Then, since

(6) $\sum_p \sum_{l=2}^{\infty} \dfrac{1}{p^l} < \infty$,

the series (ii) of § 1 is convergent if and only if so is the series

$\sum_p \dfrac{f(p)^2}{p}$;

hence, one readily sees from (5) that the convergence of the series (i), (ii)
which occur in the criterion of § 1 is equivalent to the convergence of the
respective series (I), (II) which occur in the criterion of § 2.


3. For arbitrary additive functions f, the italicized statement of § 1 will
be refined by exhibiting, in case of almost periodicity (B²), a sequence of
functions which are explicitly defined in terms of f, tend to f with reference

² Paul Erdős and Aurel Wintner, "Additive arithmetical functions and statistical
independence," American Journal of Mathematics, vol. 61 (1939), pp. 713-721.

³ Børge Jessen and Aurel Wintner, "Distribution functions and the Riemann zeta
function," Transactions of the American Mathematical Society, vol. 38 (1935), pp. 48-88,
more particularly Theorem 24 (and Theorem 25).


to the metric of the space (B²), and are almost periodic (B²). In fact, it
turns out that f cannot be almost periodic (B²) unless it is almost periodic
(B²) in virtue of its expansion (2). In other words, if f is almost periodic
(B²), then, on the one hand, each of the functions $f^{(1)}, f^{(2)}, \ldots$ is almost
periodic (B²), and, on the other hand,


(8) $M\{|f - \sum_{j=1}^{k} f^{(j)}|^2\} \to 0$ as $k \to \infty$,
where $M\{g\} = \lim_{n \to \infty} \dfrac{1}{n} \sum_{l=1}^{n} g(l)$.

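3 bis. Due to this fact, it will be possible to calculate the Fourier series
of f in terms of the Ramanujan sums


(9) $c_m(n) = \sum_j \exp\!\left(2\pi i \dfrac{j}{m} n\right)$, where $1 \le j \le m$ and $(j, m) = 1$.
In fact, the explicit form of the Fourier expansion of an arbitrary additive,
almost periodic (B²) function f(n) turns out to be


(10) $f(n) \sim a_0 + \sum_k \sum_l a_{kl} \, c_{p_k^l}(n)$,


where $l = 1, 2, 3, \ldots$, $k = 1, 2, 3, \ldots$ and


(11) $a_0 = M\{f\}$, $a_{kl} = -\dfrac{f(p_k^{l-1})}{p_k^l} + \left(1 - \dfrac{1}{p_k}\right) \sum_{i=l}^{\infty} \dfrac{f(p_k^i)}{p_k^i}$.


Since (9) consists of $\phi(m)$ terms ($\phi$ = Euler's function), and since $\phi(p^l)
= p^l - p^{l-1}$, the Parseval relation belonging to (10) is


(12) $M\{|f|^2\} = |a_0|^2 + \sum_k \sum_l (p_k^l - p_k^{l-1}) \, |a_{kl}|^2$.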
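
The weight $p_k^l - p_k^{l-1}$ in the Parseval relation is the mean square of the corresponding Ramanujan sum. A short numerical sketch of the definition (9), with illustrative helper names `ramanujan_sum`, `euler_phi` and `mean_square`; the rounding step assumes the sum is a real integer, which holds for Ramanujan sums:

```python
import cmath
from math import gcd


def ramanujan_sum(m, n):
    """c_m(n): sum of exp(2*pi*i*j*n/m) over 1 <= j <= m with gcd(j, m) = 1."""
    total = sum(
        cmath.exp(2j * cmath.pi * j * n / m)
        for j in range(1, m + 1)
        if gcd(j, m) == 1
    )
    return round(total.real)  # the exact value is an integer


def euler_phi(m):
    """Number of 1 <= j <= m coprime to m, i.e. the number of terms in c_m(n)."""
    return sum(1 for j in range(1, m + 1) if gcd(j, m) == 1)


def mean_square(m):
    """Mean of c_m(n)**2 over one full period 1 <= n <= m."""
    return sum(ramanujan_sum(m, n) ** 2 for n in range(1, m + 1)) / m
```

For a prime power $m = p^l$ the sum takes the value $p^l - p^{l-1}$ when $p^l \mid n$, the value $-p^{l-1}$ when $p^{l-1}$ divides n exactly, and 0 otherwise, so its mean square over a period is $\phi(p^l)$.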


4. It is easy to show that if f is such as to make the series (ii) of § 1
convergent, then each of the functions $f_k$ is almost periodic (B²).

To this end, use will be made of the following fact, proved loc. cit.¹
(Theorem II): If a function $g = g(n)$ of the positive integer n is such that,
for some fixed prime number p, one has


(13) $g(n) = g(p^l)$ whenever $p^l \mid n$ and $p^{l+1} \nmid n$,


then g is almost periodic (B²) if and only if


(14) $\sum_{l=1}^{\infty} \dfrac{|g(p^l)|^2}{p^l} < \infty$.


It is clear from (4) that condition (13) is satisfied by $g = f^{(k)}$ and
$p = p_k$, where k is arbitrarily fixed. Furthermore, if f is such as to make
the series (ii) convergent, then, for every fixed k,


(15) $\sum_{l=1}^{\infty} \dfrac{|f(p_k^l)|^2}{p_k^l} < \infty$,
so that, since $f(p_k^l) = f^{(k)}(p_k^l)$ in view of (4), condition (14) also is
satisfied by $g = f^{(k)}$ and $p = p_k$. Consequently, $f^{(k)}$ is almost periodic (B²).
Since k is arbitrary, and since the almost periodic (B²) functions form a linear
space, the almost periodicity (B²) of $f_k$ now follows from (3).

4 bis. It was shown loc. cit.¹ (Theorem III) that if a function g(n)
satisfies (13) for some fixed prime p and is almost periodic (B²), then its
Fourier expansion is


$g(n) \sim M\{g\} + \sum_{l=1}^{\infty} a_l \, c_{p^l}(n)$, where $a_l = -\dfrac{g(p^{l-1})}{p^l} + \left(1 - \dfrac{1}{p}\right) \sum_{i=l}^{\infty} \dfrac{g(p^i)}{p^i}$.
It follows therefore from § 4 that if f is such as to make the series (ii)
convergent, then, for every k,
$f^{(k)}(n) \sim M\{f^{(k)}\} + \sum_{l=1}^{\infty} a_{kl} \, c_{p_k^l}(n)$, where $a_{kl} = -\dfrac{f^{(k)}(p_k^{l-1})}{p_k^l} + \left(1 - \dfrac{1}{p_k}\right) \sum_{i=l}^{\infty} \dfrac{f^{(k)}(p_k^i)}{p_k^i}$.
Hence, (10) with (11) will follow from (4) as soon as it is proved that, on
the one hand, the convergence of the series (ii) is a necessary condition for
the almost periodicity (B²) of f, and that, on the other hand, f must satisfy
(8) whenever it is almost periodic (B²).


Proof of the sufficiency of the conditions.


From here on till the end of § 9, the assumption will be that f(n) is a
real additive function for which both series (i), (ii) of § 1 are convergent.
The final result (§ 9) will be that f(n) must then be almost periodic (B²).


5. In terms of the given f(n), define an F(n) as follows: F(n) is that
additive function for which the double sequence $\{\{F(p_k^l)\}\}$ is given by


(16) $F(p^l) = f(p^l)$, if $|f(p)| \ge 1$;
     $F(p^l) = f(p^l) - f(p)$, if $|f(p)| < 1$,
where $p = p_k$ and $k = 1, 2, 3, \ldots$.
It is easy to see that the convergence of the series (ii) implies that

(17) $\sum_{l=1}^{\infty} \sum_p \dfrac{|F(p^l)|}{p^l} < \infty$.


In fact, it is clear from (16) that the series (17) is majorized by A + B + C,
where

$A = \sum_p \dfrac{|F(p)|}{p}$, $B = \sum_{l=2}^{\infty} \sum_p \dfrac{|f(p^l)|}{p^l}$, $C = \sum_{l=2}^{\infty} \sum_p \dfrac{|f(p)|}{p^l}$,
and so it is sufficient to prove the convergence of these three series. But
application of (16) to $l = 1$ shows that $F(p) = 0$ unless $|F(p)| \ge 1$, in
which case $F(p) = f(p)$; so that the series A reduces to

$\sum_{|f(p)| \ge 1} \dfrac{|f(p)|}{p}$

and is therefore convergent in virtue of the assumption that the series (ii)
converges. It is clear from the same assumption and from (6), that also the
series B is convergent. Finally, the series C may be written in the form


$\sum_{l=2}^{\infty} \sum_{|f(p)| < 1} \dfrac{|f(p)|}{p^l} + \sum_{l=2}^{\infty} \sum_{|f(p)| \ge 1} \dfrac{|f(p)|}{p^l}$.


But the convergence of the first of these two double series is assured by (6),
while the second is, in view of

$\sum_{l=2}^{\infty} \dfrac{1}{p^l} = \dfrac{1}{p(p-1)} \le \dfrac{1}{p}$,

majorized by
$\sum_{|f(p)| \ge 1} \dfrac{|f(p)|}{p}$.


Since the value of the latter series was seen to be $A < \infty$, the proof of (17)
is now complete.
Similarly,

(18) $\sum_{l=1}^{\infty} \sum_p \dfrac{|F(p^l)|^2}{p^l} < \infty$.


In fact, since $(a - b)^2 \le 2(a^2 + b^2)$ for arbitrary real a, b, one sees from
(16) that the series (18) is majorized by A' + B' + C', where


$A' = \sum_p \dfrac{|F(p)|^2}{p}$, $B' = 2 \sum_{l=2}^{\infty} \sum_p \dfrac{|f(p^l)|^2}{p^l}$, $C' = 2 \sum_{l=2}^{\infty} \sum_p \dfrac{|f(p)|^2}{p^l}$.


And the proof for the convergence of these three series requires but a
repetition (with obvious simplifications) of the above proof for the convergence of
the three series A, B, C.

Notice that only the convergence of the second of the series (i), (ii)
was used thus far. The same remark will hold for § 6.


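6. It will now be shown that if $F_k(n)$ denotes the additive function
which belongs to the additive function F(n) in the same way as (3) belongs
to f(n), then

(19) $\overline{M}\{|F - F_k|^2\} \to 0$ as $k \to \infty$,


where $\overline{M}\{|g|^2\} = \limsup_{n \to \infty} \dfrac{1}{n} \sum_{l=1}^{n} |g(l)|^2$.


To this end, notice first that, by the definition (16) of the additive
function F(n),


$\sum_{m=1}^{n} |F(m) - F_k(m)|^2 \le n \sum_{l=1}^{\infty} \sum_{j=1}^{\infty} \sum_{p>k} \sum_{q>k} \dfrac{|F(p^l)| \, |F(q^j)|}{p^l q^j} + n \sum_{l=1}^{\infty} \sum_{p>k} \dfrac{|F(p^l)|^2}{p^l}$,


where n and k are arbitrarily fixed, and the summation indices p, q run
through those primes which exceed k. On writing this inequality in the form


$\dfrac{1}{n} \sum_{m=1}^{n} |F(m) - F_k(m)|^2 \le \left( \sum_{l=1}^{\infty} \sum_{p>k} \dfrac{|F(p^l)|}{p^l} \right)^2 + \sum_{l=1}^{\infty} \sum_{p>k} \dfrac{|F(p^l)|^2}{p^l}$,


keeping k fixed but letting $n \to \infty$, one sees that


(19 bis) $\overline{M}\{|F - F_k|^2\} \le \epsilon_k^2 + \epsilon'_k$,
where
$\epsilon_k = \sum_{l=1}^{\infty} \sum_{p>k} \dfrac{|F(p^l)|}{p^l}$, $\epsilon'_k = \sum_{l=1}^{\infty} \sum_{p>k} \dfrac{|F(p^l)|^2}{p^l}$.


But these sums $\epsilon_k$, $\epsilon'_k$ are identical with the k-th remainders of the convergent
series (17), (18), respectively, and tend therefore to zero as $k \to \infty$. Hence,
(19) is implied by (19 bis).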


7. If $G_k = G_k(n)$ denotes the additive function which belongs to the
additive function


(20) $G = f - F$
in the same way as $f_k$, $F_k$ belong to f, F respectively, then obviously
(21) $G_k = f_k - F_k$.


Thus, it is clear from (16) that, for any fixed k, the elements of the double
sequence $\{\{G(p^l) - G_k(p^l)\}\}$ of the additive function $G(n) - G_k(n)$ of n
are independent of l, i.e., that


(22) $G(p) - G_k(p) = G(p^2) - G_k(p^2) = G(p^3) - G_k(p^3) = \cdots$


for every prime p. It is also seen from (16) and (20) that


(23) $|G(p)| \le 1$
for every prime p.
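
The splitting $f = F + G$ can be made concrete on prime powers. The sketch below assumes the reading of (16) in which $F(p^l) = f(p^l)$ when $|f(p)| \ge 1$ and $F(p^l) = f(p^l) - f(p)$ when $|f(p)| < 1$, a reading consistent with (22) and (23); the sample values $f(p^l) = 3l/p$ and the names `F_value`, `G_value` are hypothetical choices for the illustration.

```python
from fractions import Fraction


def F_value(f, p, l):
    """Prime-power value F(p**l) under the assumed reading of (16)."""
    if abs(f(p, 1)) >= 1:
        return f(p, l)
    return f(p, l) - f(p, 1)


def G_value(f, p, l):
    """Prime-power value G(p**l) of G = f - F, as in (20)."""
    return f(p, l) - F_value(f, p, l)


def sample_f(p, l):
    """Hypothetical prime-power values f(p**l) = 3*l/p, used only as an example."""
    return Fraction(3 * l, p)
```

With these sample values, $|f(2)| \ge 1$ and $|f(3)| \ge 1$, so G vanishes on the powers of 2 and 3, while for $p \ge 5$ one gets $G(p^l) = f(p)$ for every l, independent of l and of absolute value below 1.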

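Since the series (i) of § 1 is supposed to be convergent, it is clear from
(20) and (17) that also


(24) $\sum_p \dfrac{G(p)}{p}$


is convergent.


Similarly, since the series (ii) of § 1 is supposed to be convergent, it is clear
from (20), from the Schwarz inequality
$\left| \sum_p \dfrac{f(p) F(p)}{p} \right| \le \left( \sum_p \dfrac{f(p)^2}{p} \right)^{1/2} \left( \sum_p \dfrac{F(p)^2}{p} \right)^{1/2}$
and from (18), that


(25) $\sum_p \dfrac{G(p)^2}{p} < \infty$.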

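8. Due to (22), it is now easy to transcribe the O-estimates applied
loc. cit.² (p. 716) into o-estimates, which are to the effect that


(26) $\overline{M}\{|G - G_k|^2\} \to 0$ as $k \to \infty$.
In fact, (26) may be proved as follows:


If n and k are arbitrarily fixed, one readily verifies from (22) and from
the definitions of the real additive functions G, $G_k$, that


$\sum_{m=1}^{n} |G(m) - G_k(m)|^2 = \sum'_{p>k,\, q>k} \left[\dfrac{n}{pq}\right] G(p) G(q) + \sum_{p>k} \left[\dfrac{n}{p}\right] G(p)^2$,


where [x] denotes the integral part of x, the prime of $\sum'$ means that $p \ne q$,
and the summation indices p, q run through those primes which exceed k
(however, the sums on the right are finite sums for every fixed n, since
$\left[\dfrac{n}{pq}\right] = 0$ and $\left[\dfrac{n}{p}\right] = 0$ whenever $pq > n$ and $p > n$
respectively). Consequently,


(26 bis) $\dfrac{1}{n} \sum_{m=1}^{n} |G(m) - G_k(m)|^2 \le \left( \sum_{k < p \le n^{1/2}} \dfrac{G(p)}{p} \right)^2 + 2 \sum_{n^{1/2} < p \le n} \dfrac{G(p)}{p} \sum_{k \le q \le n/p} \dfrac{G(q)}{q}$

$\qquad + \dfrac{1}{n} \sum'_{pq \le n} |G(p) G(q)| + \sum_{p>k} \dfrac{G(p)^2}{p}$.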

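8 bis. As to the inner sum in the second of the four terms on the right
of (26 bis), one sees from (24) that if k is fixed and $\epsilon$ denotes the maximum
of the function

$\left| \sum_{k \le q \le n/p} \dfrac{G(q)}{q} \right|$

of p and n on the range $n^{1/2} \le p \le n$; $n = 1, 2, \ldots$, then $\epsilon \to 0$ as $k \to \infty$,
while the absolute value of the whole second term on the right of (26 bis) has,
for every n, the majorant


$2\epsilon \sum_{n^{1/2} < p \le n} \dfrac{|G(p)|}{p} \le 2\epsilon \sum_{n^{1/2} < p \le n} \dfrac{1}{p}$,
by (23). Finally,

$\dfrac{1}{n} \sum'_{pq \le n} |G(p) G(q)| \le \dfrac{1}{n} \sum_{pq \le n} 1$, by (23).


Thus, on keeping k fixed but letting $n \to \infty$, one sees from (26 bis) that


$\limsup_{n \to \infty} \dfrac{1}{n} \sum_{m=1}^{n} |G(m) - G_k(m)|^2$

$\qquad \le \limsup_{n \to \infty} \left\{ \left( \sum_{k < p \le n^{1/2}} \dfrac{G(p)}{p} \right)^2 + 2\epsilon \sum_{n^{1/2} < p \le n} \dfrac{1}{p} + \dfrac{1}{n} \sum_{pq \le n} 1 \right\} + \sum_{p>k} \dfrac{G(p)^2}{p}$.
But p and q are prime numbers; so that
$\sum_{n^{1/2} < p \le n} \dfrac{1}{p} <$ const. and $\dfrac{1}{n} \sum_{pq \le n} 1 \to 0$
as $n \to \infty$. Hence,

$\overline{M}\{|G - G_k|^2\} \le \limsup_{n \to \infty} \left( \sum_{k < p \le n^{1/2}} \dfrac{G(p)}{p} \right)^2 + \text{const.}\,\epsilon + \sum_{p>k} \dfrac{G(p)^2}{p}$.


On letting here $k \to \infty$, and using the fact $\epsilon \to 0$ as $k \to \infty$, one sees from
(24) and (25) that the proof of (26) is complete.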


9. It is now easy to conclude that f(n) is almost periodic (B²) and
satisfies (8).

In fact, since it was proved in § 4 that $f_k$ is almost periodic (B²) in
virtue of the convergence of the series (ii), it is sufficient to show that


$\overline{M}\{|f - f_k|^2\} \to 0$ as $k \to \infty$.


But the truth of this relation is implied by (19) and (26), since it is clear
from (20) and (21) that

$f - f_k = (F - F_k) + (G - G_k)$.


Proof of the necessity of the conditions. 


What remains to be proved is that the sufficient condition represented by 
the convergence of the two series (i), (ii) of §1 is necessary as well. Thus, 
from now on till the end of the paper, the assumption will be that the f(n) 
is any given real, additive function which is almost periodic (B?). 

10. Since f(m) has an asymptotic distribution function, the two series
(I), (II) of §2 are convergent. And, in view of (5), the convergence of
(II) implies that

(27) $\sum_{|f(p)| \ge 1} \frac{1}{p} < \infty.$


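In terms of the given f(n), define an additive function D(n) by placing

(28) $D = f - H,$


where H = H(n) denotes that additive function for which the double sequence
$\{H(p^l)\}$ is given by

(29) $H(p^l) = \begin{cases} f(p^l), & \text{if } l \ne 1, \\ f(p), & \text{if } l = 1 \text{ and } |f(p)| \ge 1, \\ 0, & \text{if } l = 1 \text{ and } |f(p)| < 1, \end{cases}$


($p = p_k$ and $k = 1, 2, 3, \dots$). Thus,


$D(p^l) = \begin{cases} 0, & \text{if } l \ne 1, \\ 0, & \text{if } l = 1 \text{ and } |f(p)| \ge 1, \\ f(p), & \text{if } l = 1 \text{ and } |f(p)| < 1, \end{cases}$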


and so it is clear from (27) that one obtains two convergent series by writing
D for f in (i)-(ii), §1. Since the first half of the italicized statement of §1
was already proved (§5-§9), it follows that D(n) is almost periodic (B²).
Since f(n) is almost periodic (B²) by assumption, one sees from (28) that
H(n) is almost periodic (B²).

In particular, H(n) has a square-average


(30) $M\{H^2\} < +\infty.$


11. In what follows, r will denote any of those prime numbers for which
the absolute value of the given additive function f is not less than 1. Clearly,
(27) may be written in the form


(31) $\prod_{|f(r)| \ge 1} \left(1 - \frac{1}{r}\right) > 0.$


Since also the density of the quadratfrei integers is a positive number
($= 6\pi^{-2}$), a standard application of the sieve of Eratosthenes shows that (31)




may be interpreted as follows: If n, j are positive integers and p is a prime,
let N = N(n, p, j) denote the number of those integers between 1 and n which
are of the form $p^j s$, where s is quadratfrei, is not a multiple of p, and not a
multiple of any of the primes r (defined by $|f(r)| \ge 1$). Then there exists a
constant $\delta > 0$ which is independent of n, p, j and is such that

$N = N(n, p, j) > \delta n / p^j$ if $p^j \le n$.

Hence, it is clear from the definition (29) of the additive function H(n),
that

$\sum_{m=1}^{n} H(m)^2 > \delta n \sum_{p^l \le n} \frac{H(p^l)^2}{p^l},$

where the summation indices p (= 2, 3, 5, ...) and l (= 1, 2, ...) run through
those of their combinations for which $p^l \le n$. Thus, on writing this inequality
in the form

$\sum_{p^l \le n} \frac{H(p^l)^2}{p^l} < \text{const.}\, \frac{1}{n} \sum_{m=1}^{n} H(m)^2, \quad (\text{const.} = 1/\delta < \infty),$

and letting $n \to \infty$, one sees from (30) that

(32) $\sum_{l=1}^{\infty} \sum_{p} \frac{H(p^l)^2}{p^l} < \infty,$


where p runs through all primes.


12. In view of (29), the content of (32) is that, on the one hand,

(33) $\sum_{l=2}^{\infty} \sum_{p} \frac{f(p^l)^2}{p^l} < \infty$

and, on the other hand,

(34) $\sum_{|f(p)| \ge 1} \frac{f(p)^2}{p} < \infty;$

while (34) implies that

(35) $\sum_{|f(p)| \ge 1} \frac{|f(p)|}{p} < \infty.$


Finally, as pointed out at the beginning of §10 (cf. §2), the series
(I), (II) of §2 are convergent. This means, in view of (5), that

(36) $\sum_{|f(p)| \le 1} \frac{f(p)^2}{p} < \infty$

and that also

(37) $\sum_{|f(p)| \le 1} \frac{f(p)}{p}$ is convergent.


Now, the convergence of the series (i) and (ii) of §1 is clear from
(37), (35) and (36), (34), (33), respectively.

INSTITUTE FOR ADVANCED STUDY, 

JOHNS HOPKINS UNIVERSITY.


STATISTICAL INDEPENDENCE AND STATISTICAL 
EQUILIBRIUM.* 


By PHILIP HARTMAN and AUREL WINTNER.


Consider a conservative dynamical system which has a finite number of
degrees of freedom and a Hamiltonian function possessing everywhere
continuous partial derivatives of the second order. Suppose that some fixed value
of the energy constant h determines a closed, bounded energy surface $\Omega = \Omega(h)$
in the phase-space; and that this $\Omega$ does not contain too many or too high
critical points (e.g., that no point of $\Omega$ is an equilibrium solution of the
system). If P is any point of $\Omega$, the isoenergetic differential equations
determine on $\Omega$ a unique phase-path $P_t$ for which $P_0 = P$, and which exists for
$-\infty < t < +\infty$. The resulting isoenergetic flow on $\Omega$ may also be described
by placing $P_t = \tau_t P$, where $\tau_t$, $-\infty < t < +\infty$, is for any fixed t a
topological transformation of $\Omega$ into itself, and the function $\tau_t P$ of (t, P) is
continuous on the product space of $\Omega$ and the t-axis. If one projects the
euclidean Lebesgue measure of the phase-space on the energy surface $\Omega$ in
the usual way,¹ and denotes by $\mu(E)$ the resulting Lebesgue measure of an
arbitrary Borel subset E of $\Omega$, then $\mu(\tau_t E) = \mu(E)$ for every E and t, since
the isoenergetic differential equations which define $\tau_t$ satisfy the condition of
Liouville.²

Since obviously $0 < \mu(\Omega) < \infty$, it may be assumed that $\mu(\Omega) = 1$. Thus,
Birkhoff’s ergodic theorem is applicable³ to the flow $\tau_t$ on $\Omega$, and states that
the path $P_t$ has an asymptotic distribution function unless the initial condition
$P = P_0$ is chosen on a set of $\mu$-measure 0. It is understood that by the
asymptotic distribution function $\varphi_P$ of a path $P_t$ is meant an absolutely
additive set function $\varphi_P = \varphi_P(E)$, defined for all Borel subsets E of $\Omega$ and


* Received February 14, 1940. 

¹ Cf. e.g., T. Levi-Civita, Journal of Mathematics and Physics (M.I.T.), vol. 18
(1934), pp. 22-23.

² For n = 2, cf. the explicit equations of G. D. Birkhoff, Transactions of the
American Mathematical Society, vol. 18 (1917), pp. 211-212.

³ G. D. Birkhoff, Proceedings of the National Academy, vol. 17 (1931), pp. 656-660.
The necessity of excluding possible discontinuity sets (cf. footnote ⁴) of the asymptotic
distribution function belonging to a general P was pointed out by A. Wintner, ibid.
vol. 18 (1932), pp. 248-251; cf. P. Hartman and A. Wintner, American Journal of
Mathematics, vol. 61 (1939), pp. 977-984.

Cf. also A. Wintner, Nature, vol. 145 (1940), pp. 225-226. 




having the property that if E is any continuity set⁴ of $\varphi_P$, then the t-set
defined by $P_t \subset E$ is relatively measurable, and has $\varphi_P(E)$ as its relative
measure.⁵ In other words, $\varphi_P(E)$ is the probability that the path $P_t$,
$-\infty < t < +\infty$, which is determined by $P = P_0$, be in the portion E of $\Omega$.
Since $\Omega$ is compact, the total probability $\varphi_P(\Omega)$ is 1.

Any given Borel set E is a continuity set of $\varphi_P(E)$ for almost all P. The
proof of this fact will be omitted, since it readily follows from an estimate
which occurs in Birkhoff’s proof³ of the ergodic theorem.⁶

The fact just mentioned, when combined with Lebesgue’s term-by-term
integration of uniformly bounded sequences, obviously implies that

(1) $\mu(E) = \int_{\Omega} \varphi_P(E)\, d\mu_P$

for every Borel subset E of $\Omega$.

Consider the product space $\Omega \times \Omega$ consisting⁷ of all pairs (P, Q) of points
of $\Omega$. Obviously, products $E \times F$ of Borel sets of $\Omega$ are Borel sets of $\Omega \times \Omega$.
If on $\Omega \times \Omega$ a Lebesgue measure $\nu$ is defined by placing $\nu(E \times F) = \mu(E)\mu(F)$
for Borel sets $E \times F$, Birkhoff’s ergodic theorem is obviously applicable to
the product flow $\tau_t \times \tau_t$, with $\nu$ as the invariant measure on $\Omega \times \Omega$. Let $\varphi_{PQ}$
denote the asymptotic distribution function of the path $(P_t, Q_t) = (\tau_t P, \tau_t Q)$,
where it is understood that a set of initial points (P, Q) of $\nu$-measure 0 must
in general be excluded.

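In what follows, use will be made of the fact that if $g_K(P)$ denotes the
characteristic function of a Borel set K of $\Omega$, then

(2) $\lim_{v-u\to\infty} \frac{1}{v-u} \int_u^v \Big( \int_A g_E(P_t)\, d\mu_P \Big) \Big( \int_B g_F(Q_t)\, d\mu_Q \Big)\, dt = \iint_{A \times B} \varphi_{PQ}(E \times F)\, d\nu_{PQ}$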


⁴ By a continuity set of a distribution function is meant any Borel set E which
has the property that the distribution function attains the same value for the two Borel
sets which represent the interior and the closure of E. It is known that the Borel sets
which are not continuity sets of a fixed distribution function are exceptional in the
same sense as the discontinuity points of a fixed monotone function of a single variable.

⁵ A measurable set T of points of the t-axis is said to be relatively measurable
if the Lebesgue t-measure of the common part of T and a finite interval $u \le t \le v$,
when divided by the length $v - u$ of this interval, tends to a limit as $v - u \to \infty$;
in which case this limit is called the relative measure of T.

⁶ As to this estimate, cf. N. Wiener, Duke Mathematical Journal, vol. 5 (1939),
pp. 1-18 (cf. p. 2).

⁷ It should be emphasized that this product space is meant in the usual topological
sense and is not, as it somehow became customary in ergodic theory, the symmetric
product space. In other words, the points (P, Q) and (Q, P) of $\Omega \times \Omega$ will not be identified
in the present paper.




holds for arbitrary Borel subsets A, B, E, F of $\Omega$. In fact, if E and F are
fixed, the remark which precedes (1) shows that $E \times F$ is a continuity set of
$\varphi_{PQ}$ for almost all points (P, Q) of $\Omega \times \Omega$. On the other hand, the ergodic
theorem, when applied in its usual form to the fixed point function
$f = f(P, Q) = g_E(P)\, g_F(Q)$ on $\Omega \times \Omega$, states that the limit


$\lim_{v-u\to\infty} \frac{1}{v-u} \int_u^v g_E(P_t)\, g_F(Q_t)\, dt$

exists for almost all points (P, Q) of $\Omega \times \Omega$. Since the definition of the
asymptotic distribution function $\varphi_{PQ}$ implies that the latter limit has the
value $\varphi_{PQ}(E \times F)$ whenever $E \times F$ is a continuity set of $\varphi_{PQ}$, it follows that,
if E and F are fixed,


$\lim_{v-u\to\infty} \frac{1}{v-u} \int_u^v g_E(P_t)\, g_F(Q_t)\, dt = \varphi_{PQ}(E \times F)$

holds for almost all points (P, Q) of $\Omega \times \Omega$. Hence, (2) follows by Lebesgue’s
theorem on term-by-term integration of uniformly bounded sequences.

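Two solution paths $P_t$, $Q_t$ on $\Omega$ are said to be statistically independent
if the three asymptotic distribution functions $\varphi_{PQ}$, $\varphi_P$, $\varphi_Q$ exist and satisfy the
product condition
(3) $\varphi_{PQ}(E \times F) = \varphi_P(E)\, \varphi_Q(F)$
for all Borel sets $E \times F$, E, F of $\Omega \times \Omega$, $\Omega$, $\Omega$ which are continuity sets of
$\varphi_{PQ}$, $\varphi_P$, $\varphi_Q$, respectively.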
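
The product condition (3) can be probed numerically on a toy flow. The following is a minimal sketch, not taken from the paper: it assumes the rotation flow $x \mapsto x + t \pmod 1$ on the circle, the sets E = F = [0, 1/2), and two hypothetical initial points p = 0 and q = 0.1, and it approximates relative measures by sampling t on a uniform grid. Since the two paths move rigidly together, the joint frequency (about 0.4) differs from the product of the individual frequencies (about 0.25), so this particular pair is not statistically independent.

```python
def time_fraction(indicator, horizon=20.0, steps=20_000):
    """Approximate the relative measure of {t in [0, horizon] : indicator(t)}
    by sampling t at the midpoints of a uniform grid."""
    h = horizon / steps
    hits = sum(1 for k in range(steps) if indicator((k + 0.5) * h))
    return hits / steps


def in_half_circle(x):
    """Characteristic function of E = F = [0, 1/2) on the circle R/Z."""
    return (x % 1.0) < 0.5


# Hypothetical initial points; the rotation flow maps x to x + t (mod 1).
p, q = 0.0, 0.1

phi_P = time_fraction(lambda t: in_half_circle(p + t))
phi_Q = time_fraction(lambda t: in_half_circle(q + t))
phi_PQ = time_fraction(lambda t: in_half_circle(p + t) and in_half_circle(q + t))

print(f"phi_PQ(E x F) ~ {phi_PQ:.3f}, phi_P(E) * phi_Q(F) ~ {phi_P * phi_Q:.3f}")
```

The rotation flow is chosen only because its time averages are easy to predict by hand; it is not claimed to be one of the dynamical flows considered in the text.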

It turns out that the incompressible flows $\tau_t$ on $\Omega$ which possess this
property of the statistical independence of almost all pairs of paths on $\Omega$ are
interrelated with the incompressible flows $\tau_t$ on $\Omega$ which possess there a
property of statistical equilibrium. From the physical point of view of
statistical mechanics, this somewhat hidden interrelation between statistical
independence and statistical equilibrium might perhaps have been expected.
But we were unable to find any reference in the literature to the interrelation
of these two physical concepts. On the other hand, the mathematical literature
contains all of the tools necessary for this identification. In fact, Birkhoff’s
ergodic theorem, when stated as above in terms of asymptotic distribution
functions,⁸ insures that the notion of statistical independence may be
meant in its mathematical formulation, used loc. cit.⁸; while it is known that
the notion of statistical equilibrium may be approached mathematically as
follows:


⁸ Cf. P. Hartman, E. R. van Kampen and A. Wintner, American Journal of
Mathematics, vol. 61 (1939), pp. 477-486.
⁹ Cf. E. Hopf, Proceedings of the National Academy, vol. 18 (1932), pp. 333-340.




Suppose that the flow $\tau_t$ has the property that if there are given any
Borel subset E of $\Omega$ and any “density of probability” as an integrable
function f = f(P) of P on $\Omega$, then the probability carried by the set into which
E is shifted by the flow $\tau_t$ tends to a limit as $t \to \infty$. If this condition is
satisfied, i.e., if any given initial probability distribution is transformed in such
a way as to become practically independent of t for large t, with reference to
any Borel set E, then the flow $\tau_t$ is said to tend to statistical equilibrium.
Since it may be shown¹⁰ that, instead of arbitrary integrable functions f, it is
sufficient to consider characteristic functions $g_F(P)$ of arbitrary Borel sets F,
the condition for a flow $\tau_t$ to tend to statistical equilibrium consists of the
existence of the limit⁹
(4) $\lim_{t \to \infty} \mu(E_t \cdot F),$

for any pair E, F of Borel subsets of $\Omega$. In fact, condition (4), where $A \cdot B$
denotes the common part of A and B, is precisely the previous definition,
since obviously

Et 
It is clear from (4) that if the limit (3) exists, its value is

(6) $\lim_{t \to \infty} \mu(E_t \cdot F) = \int_E \varphi_P(F)\, d\mu_P,$

where $\varphi_P$ is the asymptotic distribution function of $P_t$. If it is only required
that $\mu(E_t \cdot F)$ should become practically independent of t on the average, in
the sense that, instead of the existence of the limit (6), one merely has⁹


(7) $\lim_{v-u\to\infty} \frac{1}{v-u} \int_u^v \Big[ \mu(E_t \cdot F) - \int_E \varphi_P(F)\, d\mu_P \Big]^2 dt = 0$


for any pair E, F of Borel subsets of $\Omega$, then the flow $\tau_t$ is said to tend to
statistical equilibrium on the average. While it is clear that statistical
equilibrium is sufficient for statistical equilibrium on the average, the converse is
not true, at least¹¹ if the flow $\tau_t$ is not required to be one determined by an
isoenergetic dynamical system. Incidentally, the content of the requirement
(7) remains unchanged if one replaces the square $[\ ]^2$ by $|[\ ]|$; in fact,
the integrand $[\ ]$ is a bounded function of t, since $0 \le \mu \le 1$.


¹⁰ Cf. G. D. Birkhoff, loc. cit.³; N. Wiener, loc. cit.⁶.
¹¹ An example to this effect was given by B. O. Koopman and J. v. Neumann,
Proceedings of the National Academy, vol. 18 (1932), pp. 255-263.




The interrelation between statistical independence and statistical
equilibrium, as announced above, may now be formulated as follows: In order
that a flow be such as to make almost all pairs of paths statistically
independent, it is sufficient (but, at least in case of a non-dynamical flow, not
necessary) that it tend to statistical equilibrium; in fact, almost all pairs of
paths are statistically independent if and only if the flow tends to statistical
equilibrium on the average.
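
As a discrete-time illustration of the limit (4), the following sketch (an assumption-laden toy, not an example from the paper) replaces the flow by the doubling map $Tx = 2x \pmod 1$ on [0, 1) with Lebesgue measure, and computes the measure of the set of x in F with $T^n x$ in E for E = F = [0, 1/2). The function name, the dyadic grid, and the choice of sets are hypothetical; the grid count is exact only while n is smaller than the grid depth m.

```python
from fractions import Fraction


def doubling_correlation(n, m=16):
    """Exact measure of {x in F : T^n x in E} for T x = 2x mod 1 and
    E = F = [0, 1/2), counted on the dyadic grid j / 2**m (valid for n < m)."""
    size = 1 << m
    half = size >> 1
    # j / 2**m lies in F when j < half; T^n maps j to (j * 2**n) mod 2**m.
    hits = sum(1 for j in range(half) if (j << n) % size < half)
    return Fraction(hits, size)


for n in range(6):
    print(n, doubling_correlation(n))
```

For n = 0 the count returns the measure of F itself, while for 1 ≤ n < m it returns the product of the measures of E and F, which is the behaviour that statistical equilibrium requires of the limit (4).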

That a flow $\tau_t$ which makes almost all pairs of paths statistically
independent is necessarily a flow tending to statistical equilibrium on the average,
is implied by the second half of Theorem 5 of E. Hopf.⁹ In fact, one can
easily prove that his Theorem 5 is to the following effect: There is tendency
toward statistical equilibrium on the average if and only if the condition (3),
instead of being satisfied for all pairs (E, F), is satisfied for symmetric pairs
(E, E) only (it being understood that a zero set of pairs of points (P, Q)
is always excluded). Apparently, it is this symmetry restriction¹² which has
thus far disguised the interrelation between statistical independence and
statistical equilibrium (either strict or average). In fact, as will be shown
in the Appendix, two measurable functions of t need not be statistically
independent if the condition corresponding to (3) is required for symmetric
pairs (E, F) = (E, E) only.

Nevertheless, it will now be shown that almost all pairs of paths are
statistically independent in the case of a flow which tends to statistical equilibrium
on the average.

To this end, suppose that the flow $\tau_t$ satisfies the average condition (7)
for arbitrary Borel sets E, F, and write A, B for E, F; so that

(7 bis) $\lim_{v-u\to\infty} \frac{1}{v-u} \int_u^v \Big[ \mu(A_t \cdot B) - \int_A \varphi_Q(B)\, d\mu_Q \Big]^2 dt = 0.$

Since both functions $\mu(A_t \cdot B)$, $\mu(E_t \cdot F)$ of t lie between 0 and 1, it readily
follows¹³ from (7) and (7 bis) that


¹² Cf. footnote 7.
¹³ This depends on the following obvious remark: If x(t), y(t) are bounded
measurable functions for which there exist constants $\alpha$, $\beta$ such that


$\frac{1}{v-u} \int_u^v [x(t) - \alpha]^2\, dt \to 0, \qquad \frac{1}{v-u} \int_u^v [y(t) - \beta]^2\, dt \to 0,$


then


$\frac{1}{v-u} \int_u^v [x(t)\, y(t) - \alpha\beta]^2\, dt \to 0,$


as $v - u \to \infty$. It is sufficient to prove this in case x(t) and y(t) represent the same




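$\lim_{v-u\to\infty} \frac{1}{v-u} \int_u^v \Big[ \mu(A_t \cdot B)\, \mu(E_t \cdot F) - \Big( \int_A \varphi_Q(B)\, d\mu_Q \Big) \Big( \int_E \varphi_P(F)\, d\mu_P \Big) \Big]^2 dt = 0,$

or, if B and E are interchanged,


$\lim_{v-u\to\infty} \frac{1}{v-u} \int_u^v \Big[ \mu(A_t \cdot E)\, \mu(B_t \cdot F) - \Big( \int_A \varphi_P(E)\, d\mu_P \Big) \Big( \int_B \varphi_Q(F)\, d\mu_Q \Big) \Big]^2 dt = 0.$


On the other hand, the identities (2) and (5) imply that


$\iint_{A \times B} \varphi_{PQ}(E \times F)\, d\nu_{PQ} = \lim_{v-u\to\infty} \frac{1}{v-u} \int_u^v \mu(A_t \cdot E)\, \mu(B_t \cdot F)\, dt.$


But comparison of the last two relations gives


$\iint_{A \times B} \varphi_{PQ}(E \times F)\, d\nu_{PQ} = \Big( \int_A \varphi_P(E)\, d\mu_P \Big) \Big( \int_B \varphi_Q(F)\, d\mu_Q \Big).$

Hence, by Fubini’s theorem,


$\iint_{A \times B} \varphi_{PQ}(E \times F)\, d\nu_{PQ} = \iint_{A \times B} \varphi_P(E)\, \varphi_Q(F)\, d\nu_{PQ},$


since $\nu(E \times F)$ was defined as the product measure $\mu(E)\mu(F)$. Since
the factors A, B of the integration domain $A \times B$ are arbitrary Borel sets of $\Omega$,
it follows from the separability of $\Omega$ that the condition (3) of statistical
independence is satisfied by almost all points (P, Q) of $\Omega \times \Omega$.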


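If, instead of two paths $P_t$, $Q_t$, one considers n paths $P_t, Q_t, \dots, R_t$,
their statistical independence is defined by the requirement⁸

$\varphi_{PQ \cdots R}(E \times F \times \cdots \times G) = \varphi_P(E)\, \varphi_Q(F) \cdots \varphi_R(G),$

which reduces for n = 2 to (3). It is known⁸ that n = 3 given functions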


function, and then successively choose the latter function to be x(t), y(t), x(t) + y(t).
But if x(t) is bounded, then

$[x(t)^2 - \alpha^2]^2 = [x(t) + \alpha]^2\, [x(t) - \alpha]^2 \le \text{const.}\, [x(t) - \alpha]^2.$

(It is seen that it is sufficient to assume the boundedness of only one of the two
functions x, y.)




need not be statistically independent if any of the three pairs which may be
selected from them consists of two statistically independent functions. This
might be one of the reasons why, on the basis of mere time averages of the
solutions of the (isoenergetic) differential equations of classical dynamics, no
mathematical theory has been developed thus far for physical facts of the type
of the Maxwell-Boltzmann distribution or of the H-theorem. For these facts
are asymptotic statements of the same type as is the validity of the normal
distribution law in the theory of errors; so that the number n of independent
realizations of one and the same model must be chosen arbitrarily large, since no
statement can be made for a fixed n (in particular, for n = 2).

But it turns out that, due to the fact that the product spaces considered
are not the symmetric product spaces,⁷ it is not difficult to pass from n = 2
to any n. While this sounds surprising in view of the examples just
mentioned,⁸ all that actually happens is that n-uples of paths, possibly exceptional
from the point of view of independence, are contained in zero sets which may
vary with n. In other words, if the flow makes the two paths $P_t$, $Q_t$
statistically independent for almost all choices of (P, Q) on $\Omega \times \Omega$, then it also
makes the n paths $P_t, Q_t, \dots, R_t$ statistically independent for almost all
choices (P, Q, ..., R) on $\Omega \times \Omega \times \cdots \times \Omega$, where n is arbitrary and it is
understood that the sets excluded are zero sets with reference to the product
measures (of $\mu$) on $\Omega \times \Omega$ and $\Omega \times \Omega \times \cdots \times \Omega$, respectively.

In fact, it is clear that the calculation following (7 bis) may be carried 
out so as to show that the assumption (7) implies the statistical independence 
of almost all n-uples of solution paths, not only for n = 2 but for arbitrary n. 
Hence, the italicized statement follows from the fact that the requirement (7) 
of ultimate statistical equilibrium on the average was seen to be equivalent to 
the requirement of the statistical independence of almost all pairs of paths. 

It follows that the flow satisfies the requirement (7) of ultimate
statistical equilibrium on the average if and only if the paths in almost all n-uples
of paths are statistically independent. This fact is, from the physical point
of view, more important than the equivalent criterion in which n is restricted
to n = 2. In fact, it now becomes admissible to consider a product space
$\Omega \times \Omega \times \cdots \times \Omega$ of arbitrarily many factors, thus introducing the flow on
$\Omega$ in n independent copies, and then make $n \to \infty$. But this is precisely the
relevant mathematical assumption of the theory of limit distributions in
statistical mechanics.

If the incompressible flow $\tau_t$ on $\Omega$, instead of satisfying any statistical
assumption, is such as to make the asymptotic distribution function $\varphi_P$ of
the path $P_t$ independent of the initial condition P (for almost all P), then
the flow $\tau_t$ is necessarily metrically transitive, since (1) then reduces to




$\varphi_P = \mu$ for almost all P. In particular, the class of those flows which satisfy
the requirement (6) of ultimate statistical equilibrium and are at the same
time such as to make $\varphi_P$ independent of P for almost all P, is identical with
the class of the flows to which the (in some respect misleading) name
“mixture” was given.

Hedlund¹⁴ has recently proved that if $\Omega$ is a two-dimensional Riemannian
manifold of constant negative curvature, of finite connectivity and of finite
area, then the geodesic flow on $\Omega$ is a mixture. It follows, therefore, from
the last italicized theorem, that the geodesic flow on any such $\Omega$ makes the
paths of almost all n-uples of geodesics statistically independent of each other.
Notice that in this example one has, besides statistical independence,
asymptotic equidistribution of almost all paths; so that no example of an isoenergetic
dynamical system seems to be known in which almost all pairs of paths are
statistically independent but which is not metrically transitive.


APPENDIX. 
It is known that two real measurable functions x(t), y(t), $-\infty < t < +\infty$,
are statistically independent if and only if the Fourier average


(I) $\Lambda(u, v) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} \exp i\{u\, x(t) + v\, y(t)\}\, dt$


exists uniformly in every fixed bounded domain of a real (u, v)-plane and
satisfies the functional equation

(II) $\Lambda(u, v) = \Lambda(u, 0)\, \Lambda(0, v).$


On the other hand, if instead of statistical independence, which corresponds
to (3), one requires that the condition corresponding to (3) be satisfied for
symmetric pairs (E, F) = (E, E) only, then an obvious adaptation of the
considerations applied loc. cit. shows that (II) must be replaced by the
weaker condition


(III) $\Lambda(u, v) + \Lambda(v, u) = \Lambda(u, 0)\, \Lambda(0, v) + \Lambda(v, 0)\, \Lambda(0, u),$


[which is again necessary and sufficient, provided that (I) exists uniformly
in every fixed bounded domain of the (u, v)-plane, i.e., provided that the
vector (x(t), y(t)) has an asymptotic distribution function]. But it is easy
to construct a pair (x(t), y(t)) which satisfies (III) without satisfying (II).
Actually, the pair will be chosen periodic in t; so that (I) reduces to


¹⁴ G. A. Hedlund, Annals of Mathematics, vol. 40 (1939), pp. 370-383.




(IV) $\Lambda(u, v) = \int_0^1 \exp i\{u\, x(t) + v\, y(t)\}\, dt,$
if the period is 1.


First, define a function L of two real variables u, v by placing 
IL (u, v) = 1 -f- ei(urv) e7t(urv) + Qeiu +. 


Then an easy calculation shows that the functional equation (III) is, and
that (II) is not, satisfied by $\Lambda = L$.

On the other hand, the function L(u, v) is a trigonometric polynomial
in which the coefficients of the exponentials are positive and have the sum 1.
This means that L(u, v) is the Fourier-Stieltjes transform of a 2-dimensional
purely discontinuous distribution function (with a finite number of jumps).
Hence, it is clear that one can choose on the interval $0 \le t \le 1$ two step
functions x(t), y(t) for which $\Lambda = L$ satisfies (IV).

Incidentally, the trigonometric polynomial L(u, v) is seen to satisfy
the symmetry relation L(u, 0) = L(0, u). This means that the two functions
x(t), y(t) have the same distribution function.
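
The phenomenon can also be checked on a small finite distribution. The sketch below does not reproduce the polynomial L above; it assumes a hypothetical joint distribution of (x, y) on the points {0, 1, 2} × {0, 1, 2} with weights P[i][j], chosen so that both marginal distributions are uniform, and verifies in exact arithmetic that the product condition holds for every symmetric pair (E, E) while failing for some pair (E, F) with E ≠ F.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical joint weights P[i][j] = Prob(x = i, y = j); not taken from the paper.
P = [[Fraction(w, 9) for w in row] for row in ([1, 2, 0], [0, 1, 2], [2, 0, 1])]
POINTS = range(3)
SUBSETS = [set(c) for k in range(4) for c in combinations(POINTS, k)]


def joint(E, F):
    """Probability that x lies in E and y lies in F."""
    return sum(P[i][j] for i in E for j in F)


def product_condition(E, F):
    """Analogue of condition (3): joint weight equals product of marginal weights."""
    return joint(E, F) == joint(E, POINTS) * joint(POINTS, F)


symmetric_ok = all(product_condition(E, E) for E in SUBSETS)
fully_independent = all(product_condition(E, F) for E in SUBSETS for F in SUBSETS)
print("symmetric pairs satisfy the product condition:", symmetric_ok)
print("all pairs satisfy the product condition:", fully_independent)
```

The weights are the uniform product distribution perturbed by an antisymmetric matrix with vanishing row and column sums, which is one simple way of leaving all symmetric pairs unaffected.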


QUEENS COLLEGE, 
THE JOHNS HOPKINS UNIVERSITY.




ON AN ASYMPTOTIC FORMULA FOR THE FOURIER TRANSFORMS
OF DISTRIBUTIONS ON CERTAIN CURVES.*


By E. K. HAVILAND.


The smoothness of infinite convolutions of the type occurring in the
theory of the Riemann zeta-function has been treated¹ by an estimate of
Fourier-Stieltjes transforms of the distributions on convex curves. An earlier
method² of obtaining such an estimate consisted in an extension of the usual
estimate of the Bessel functions $J_n$, making use of a lemma of van der Corput
and an assumption that the spectra are sufficiently smooth convex curves.
The resulting estimate has then been refined² in such a way as to yield an
asymptotic formula also. In the case where merely an appraisal is desired,
the foregoing method has been superseded by a simpler and more general one,³
quite elementary in nature, which is free of the restrictions of dimensionality,
analyticity and convexity imposed by the earlier method. This latter method
does not, however, admit of obtaining an asymptotic formula, and it is the
purpose of the present paper to obtain such a formula without the restriction
of convexity and with fewer restrictions on the smoothness of the curves. The
increased generality is obtained largely by following a method of Hartman⁴
for obtaining an asymptotic formula for exponential integrals.

Let $x = x(\phi)$, $y = y(\phi)$, where $0 \le \phi < 2\pi$, be a parametric
representation of a Jordan curve, S, to be described more precisely below, in the
(x, y)-plane. Let $\sigma = \sigma(E)$ be an absolutely additive set function defined,
for every Borel set, E, of the (x, y)-plane, by setting $\sigma(E)$ equal to $1/2\pi$ times
the linear measure of those $\phi$ for which $(x(\phi), y(\phi))$ is contained in E,
if E is any open set in the plane. In particular, it is seen that S is the
* Received November 22, 1939. 

¹ Cf. A. Wintner, “Upon a statistical method in the theory of diophantine
approximations,” American Journal of Mathematics, vol. 55 (1933), pp. 309-331; B. Jessen
and A. Wintner, “Distribution functions and the Riemann zeta-function,” Transactions
of the American Mathematical Society, vol. 38 (1935), pp. 48-88.

² Cf. E. K. Haviland and A. Wintner, “On the Fourier transforms of distributions
on convex curves,” Duke Mathematical Journal, vol. 2 (1936), pp. 712-721.

³ Cf. A. Wintner, “On the smoothness of infinite convolutions of the type occurring
in the theory of the Riemann zeta-function,” American Journal of Mathematics, vol. 61
(1939), pp. 231-236.

⁴ Cf. Philip Hartman, “An asymptotic formula for exponential integrals,” American
Journal of Mathematics, vol. 62 (1940), pp. 115-121.




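spectrum⁵ of $\sigma$. From the definition of Lebesgue and of Radon integrals,
it follows that


(1) $\Lambda(u, v; \sigma) = \iint \exp[i(ux + vy)]\, d\sigma = \frac{1}{2\pi} \int_0^{2\pi} \exp[i(u\, x(\phi) + v\, y(\phi))]\, d\phi.$


On setting $u = r \cos\psi$ and $v = r \sin\psi$, one obtains


(2) $\Lambda = \Lambda(r \cos\psi, r \sin\psi; \sigma) = \frac{1}{2\pi} \int_0^{2\pi} \exp[i r h(\phi; \psi)]\, d\phi,$
where
(2a) $h(\phi; \psi) = x(\phi) \cos\psi + y(\phi) \sin\psi.$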


It will be assumed that


(i) $x(\phi)$ and $y(\phi)$ possess second derivatives of bounded variation;


(ii) $h'(\phi; \psi)$ has, for any fixed $\psi$, exactly 2n zeros on the curve S and
these zeros are all simple. Furthermore the zeros of $h''(\phi; \psi)$ are all simple⁶
and 2n in number. Here and in what follows n is a fixed positive integer and
a prime denotes partial differentiation with respect to $\phi$. As a consequence
of (i), $h''(\phi; \psi) = x''(\phi) \cos\psi + y''(\phi) \sin\psi$ is continuous on the torus T:
$0 \le \phi < 2\pi$; $0 \le \psi < 2\pi$. As a consequence of (ii), the zeros of $h''$
separate those of $h'$. Thus the convex curves previously treated² are included as
a proper subclass of the curves S now considered.


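Under the foregoing assumptions, it will be shown that


(3) $\Lambda = (2\pi r)^{-1/2} \sum_{k=1}^{n} \big\{ [h''(\phi_{2k}; \psi)]^{-1/2} \exp[i(r h(\phi_{2k}; \psi) + \pi/4)] + [-h''(\phi_{2k-1}; \psi)]^{-1/2} \exp[i(r h(\phi_{2k-1}; \psi) - \pi/4)] \big\} + o(r^{-1/2}),$


where the o-term holds uniformly for all $\psi$, and $\phi_{2k-1} = \phi_{2k-1}(\psi)$,
$\phi_{2k} = \phi_{2k}(\psi)$ represent the zeros of $h'$ on S, the former corresponding to
maxima of h and the latter to minima. It will be observed that the $o(r^{-1/2})$ of the
present paper replaces the $O(r^{-1})$ of the previous treatment, so that we now get precisely
an asymptotic formula without a remainder term.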
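
For the unit circle $x(\phi) = \cos\phi$, $y(\phi) = \sin\phi$ one has $h(\phi; \psi) = \cos(\phi - \psi)$, with a single maximum ($h = 1$, $h'' = -1$) and a single minimum ($h = -1$, $h'' = 1$), and the transform is the Bessel function $J_0(r)$, whose classical leading term is $(2/\pi r)^{1/2} \cos(r - \pi/4)$. The sketch below is only an illustrative check under these assumptions (the function names and the sample count are hypothetical): it compares a direct numerical evaluation of the integral (2) with that leading term.

```python
import cmath
import math


def transform_on_circle(r, psi=0.0, samples=4096):
    """Evaluate (1/2pi) * integral over [0, 2pi) of exp(i r h(phi; psi)) dphi
    for the unit circle, using the equally spaced rectangle rule."""
    total = 0j
    for k in range(samples):
        phi = 2.0 * math.pi * k / samples
        h = math.cos(phi) * math.cos(psi) + math.sin(phi) * math.sin(psi)
        total += cmath.exp(1j * r * h)
    return total / samples


def leading_term_circle(r):
    """Stationary-phase leading term for the circle: sqrt(2/(pi r)) cos(r - pi/4)."""
    return math.sqrt(2.0 / (math.pi * r)) * math.cos(r - math.pi / 4.0)


for r in (50.0, 200.0):
    value = transform_on_circle(r)
    print(r, value.real, leading_term_circle(r), abs(value.real - leading_term_circle(r)))
```

The rectangle rule is used because it is highly accurate for smooth periodic integrands; the printed difference should be small compared with $r^{-1/2}$, in line with an error term of the form $o(r^{-1/2})$.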

The proof of (3) proceeds as follows. First, the minimum distance 
between a zero of h’ and a zero of h” has, for reasons of continuity, a positive 


⁵ For the definition of the spectrum, cf. A. Wintner, “On the addition of independent
distributions,” American Journal of Mathematics, vol. 56 (1934), pp. 8-16.

⁶ (ii) might be generalized, under suitable assumptions, to the case where the
second derivative has multiple zeros.




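lower bound independent of $\psi$. Let $\phi_k = \phi_k(\psi)$, (k = 1, ..., 2n), be the
zeros of $h'(\phi; \psi)$ and let them be so situated that

$\phi_1 < \phi_2 < \cdots < \phi_{2n} < \phi_1 + 2\pi.$

Finally, let 2n numbers $\eta_k$ be so chosen, as indicated more precisely below, that

$\phi_1 < \eta_1 < \phi_2 < \cdots < \phi_{2n} < \eta_{2n} < \phi_1 + 2\pi.$

As $h''(\phi; \psi)$ is continuous on the torus T, it is clear that, if we write

$h''(\phi; \psi) = h''(\phi_k; \psi) + \omega(\phi; \phi_k; \psi)$, (k = 1, 2, ..., 2n),
then $\omega(\phi; \phi_k; \psi)$ will possess the same property, so that to a preassigned
$\epsilon > 0$ there corresponds a $\delta = \delta(\epsilon)$, independent of $\phi$ and of $\psi$, such that

$|\omega(\phi; \phi_k; \psi)| < \epsilon$
for all $\phi$ such that $|\phi - \phi_k| < \delta$.

Moreover, it is clear from (ii) that there exists a positive constant $\alpha$ such
that $|h''(\phi_k; \psi)| > \alpha$ for every $\psi$. Then one may choose $\eta_1 = \eta_1(\psi)$ so that
it lies between $\phi_1 + \xi/3$ and $\phi_1 + 2\xi/3$ for all $\psi$, where $\xi = \min(Z, \delta(\alpha/4))$
and is therefore independent of $\psi$, and a similar choice will be made for the
remaining $\eta_k$’s.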


From (2), 


1 7m N2 gs 
fy Ne Nen-1 2n 
J, + Js Js +-- + J + 
say. 


These integrals fall essentially into two classes: those, such as J;, in 


which the integration range possesses an end point 
and those, such as J2, with an integration range containing a point ¢ux, 
== 1.2.- +.n), in its interior. In order to treat J,, set for ¢,5 = $= m 
where ¢; = and m =m(¥), 

(5) = h(pi;~) —h(o;y) 


for every fixed wy, corresponding to the fact that ¢: is a simple zero of h by 
assumption (ii). On taking the positive square root, 

(6) =| —h( 

so that as @ increases steadily from ¢; to m, the variable ¢ increases steadily 
from zero to a quantity = | 3¥) —h(m() 3 which has 
a positive lower bound f independent of y in virtue of (ii). Hence ¢ is in 
[¢:(~).:(w) ].a monotone, continuous function of ¢. 

Moreover, if a dot represents partial differentiation with respect to /, 




(7) = — if tSa(y), 
so 
ay(y 
The integral in (8) is of the form 
(9) f(t) exp 
where 
(10) f(t) =f(ts¥) sy). 


It is known⁴ that the integral (9) can, for every fixed $\psi$, be evaluated
asymptotically under the assumption that $f(t) = f(t; \psi)$ is of bounded variation in
$[0, a_1(\psi)]$. That the function (10) possesses this property may be seen as
follows:

Applying Taylor’s Theorem with the integral form of the remainder, 
one obtains 


(11) N(psy) =h(o) =h(gi) + (6 — 


where ¢; = ¢:(W). Since h’(¢,) =0, (6) becomes, after a change of inte- 
gration variable in (11), 


(12) t= {— (¢— s)h” (s) 


Similarly, we have 


(13) Ww = f  W"(s) ds. 


¢ 
Since by hypothesis h”(¢;~) =h”(¢) is, for all fixed y, a continuous func- 
tion of ¢ and since h”(¢,) 40, we may write, as above, 

($1) + 0(¢)3 | o(g)| <e if < d(€). 
Then (12) becomes 
= — 0(s) as}! 


and (13) becomes 
= (6 — (41) +f w(s)ds. 
Substituting these values into (10), we obtain 




(id = {— $hi” — Ly1}4/{hy” + 
where 
(16) Lip=(b— ($—s)P0(8) ds, 
j 


Now the Ly», (py =0,1), are, for fixed y, continuous functions of ¢@ for 
<<¢Sm(y), and 
(17) |Lip|S (o— oi)" f |w(s)|ds<e, if O0< | < 8(e), 

hj 


so that the are, for fixed y, continuous functions of ¢ in 
also if we define Lj,—0 for ¢=4@;. Furthermore, as pointed out above. 
§=6(e) in (17) is independent of wy and of ¢; on 7. By virtue of the 
choice of 4, and of the existence of the quantity’ a, it follows that for all 
¢,(¢: and for all y, 

(18) | hi” + Lio | > @/2 and | $hy” — Iy, | > @/4. 

From (15), (16), (17) and (18), it is clear that f is a continuous function 

of and since ¢ is, for fixed a continuous function of ¢ in 

)StSa,(y), it is seen that f(t; y) possesses the same property. Then as 

¢—=—2f, it is clear that @ tends to a definite limit as {—> + 0, which 

implies that ¢ exists and (7) holds at 4 = 0 also. 

We now proceed to show that $f(t) = f(t; \psi)$, as defined by (10), is a
function of bounded variation in t for $0 \le t \le a_1(\psi)$. In the first place,
if a function $f(\phi)$ is of bounded variation in $\phi$ and $\phi$, in turn, is a monotone
continuous function of t, say in [0, a], then $f(\phi(t))$ is a function of bounded
variation in t in an appropriate interval. Consequently, by virtue of the
remark following equation (6), we need prove only that f, as a function of $\phi$,
is of bounded variation. This we do by using (15) and the following familiar
properties of functions of bounded variation:

($\alpha$) The product of two functions of bounded variation in an interval [a, b]
is a function of bounded variation there.

($\beta$) If a function F(x) is of bounded variation in [a, b] and if $F(x) > \gamma > 0$
there, then $(F(x))^{1/2}$ and $(F(x))^{-1}$ are of bounded variation there.

($\gamma$) The product of two positive monotone non-decreasing (non-increasing)
functions is again a positive monotone non-decreasing (non-increasing)
function.

($\delta$) $a(x - b) + c$ is monotone increasing (decreasing) if a > 0 (a < 0).

($\epsilon$) The mean value over a finite interval of a function of bounded variation


⁷ Introduced prior to equation (4).




(monotone function) is again a function of bounded variation (monotone
function).


We now apply the foregoing results to the function F(¢) defined by the 
right-hand member of (14), considering first the second term in the numerator, 
which may be written 


(19) ds = f"0(s) [1 


By. hypothesis, o(s) =o,(s) —wo2(s), where are monotone non- 
decreasing. Since ;, 2 are both bounded, there exists a positive constant, (, 
such that ,(s) + C, w2(s) + are positive for all s in [¢,,4,]. Then the 
left-hand member of (19) may be written in the form 


(20) (w,(s) + C)ds 


(¢— ¢,)* (w2(s) C)ds 


f (o2(s) + C)(s—¢1)ds 


M, and M, are monotone by virtue of (e). The integrand in Mz is the 
product of two functions of ¢ non-negative and monotone non-decreasing in 
[¢:1,:]. Now if two functions F(a), are non-negative and monotone 
non-decreasing in an interval [a,b], then not only are 


monotone non-decreasing functions of x there, which is a consequence of (7) 
and (¢), but, in addition, as may be proved by using the First Mean Value 
Theorem, = where x(x) is again monotone nol- 
decreasing. If we identify with o,(¢) + and F.(x) with ¢—¢: 
then M, corresponds to Mt,. and we have 


8T.e., if F(x) is of bounded variation in [a,b] and aSé=b, then (6) 
= (€—a)" F(«)dx is of bounded variation in & The proof in the case of 4 


a 
function of bounded variation is readily obtained by decomposing F(a) into It 
monotone components. 




FOURIER TRANSFORMS OF DISTRIBUTIONS ON CERTAIN CURVES. 661 


Consequently, (¢ —¢,) 'M.(¢) = and is a monotone (non-decreasing)
function of $\phi$, and in the same way the monotone character of (¢— ¢,)*M,(¢)
may be established. Since the sum of a finite number of functions of bounded
variation is a function of bounded variation, it follows that the second term
in the radicand in the numerator of (14) is of bounded variation in $\phi$.

That the second term in the denominator of (14) is of bounded variation 
in ¢ follows immediately from (e€). From (18) and (), it follows that the 
entire numerator of (14) is a function of bounded variation in ¢. Finally, 
by virtue of (18), we may apply («) and (8) and infer that (14), as a func- 
tion of ¢, is of bounded variation in [¢1, 7 | = [¢:(W),(W)]. Hence, as 
has been pointed out, f(t) —f(t;w) is, for fixed w in [0, 27], a function of 
bounded variation in in (OStSa,(y)). 

We now write 

In view of (15), this may be written 
f(t) — —(— 
+ | { (hi)? + 2h,” + hy” + Ly0|/(— + 
—f(+0) + 

In the first place, we observe that | f(0)| = (2), uniformly for all y. 

Secondly, @(¢:y) may be rewritten in the form 

9h” — 2hy” Lip — 10 
(— + ({ (hi)? 4+ — — Dio) 
From (18) it then follows that the absolute value of the denominator in (22) 
is not less than (2a)2(a?/4) = a°/*/2%/2 > 0 uniformly with respect to ¢, 
while in the numerator h,” is uniformly bounded with respect to y and 
Lijp|—>0 as t+ 0, uniformly with respect to y, as appears from (17). 
Consequently, 


(23) | @(t;y~)| if <4,(e); ie, if OSt < 8(e), 


(22) = 


where 8, 8, are independent of yw. 

We now prove, following the method of Hartman, that 

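If $f(t) = f(t;\psi) = f(+0) + \Phi(t;\psi)$, where $\Phi(t;\psi)$ is of bounded
variation in $(0 \le t \le a_1(\psi))$ and $\Phi(+0;\psi) = 0$, then


(24) $\int_0^{a_1(\psi)} f(t;\psi) \exp[-irt^2]\,dt = \tfrac{1}{2} f(+0)\,\pi^{1/2} (ir)^{-1/2} + o(r^{-1/2}),$


where the o-term holds uniformly for all $\psi$.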




Proof: 


(25) f sy) exp [— (+ 0) exp [— irl? ]dt 
ra fey) exp [— irt?]dt = Ia + Ib. 


0 
Now #4 


[— irt’]dt — f(+ 9) fue? [— irt?]dt 
J 0 Ja) 
= 3f(+ exp [— m/4] — f(+ 9) [— irt?]dt. 


But | f(+ = (—2h.”)* S (2a)+ for all y. Furthermore, 


(26) $\left| \int_{a_1(\psi)}^{+\infty} \exp[-irt^2]\,dt \right| < \text{Const.}/r.$


+00 
For G2(7r) exp [— iy]dy exists and is in virtue of the Second 
r 


Mean Value Theorem applied to a finite interval. On, setting rl? = y, the 
integral in (26) becomes, up to a constant factor, 


4G, ]?) = since a,(W) > B forall y, 


where £8 is the constant defined, above following equation (6). Consequently, 
by (21) and (26), 
(27) Ta = — }(— (a) 4 exp [— tr/4] + O(07'), 
where the O-term is independent of y, in the sense that in absolute value it 
is not greater than const./r, where the constant is independent of wy. 
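
The tail estimate (26) and the leading term that survives in (27) can be checked numerically. The following is only an illustrative sketch, not part of the argument: it assumes the classical closed form $\int_0^{\infty} e^{-irt^2}\,dt = \tfrac{1}{2}(\pi/r)^{1/2} e^{-i\pi/4}$ and the elementary bound $1/(ra)$ for the tail beyond $a > 0$ (one integration by parts). The function names and the use of Simpson's rule are our own choices.

```python
import cmath
import math


def fresnel_partial(r, a, n=20000):
    """Composite Simpson approximation of the integral of exp(-i r t^2) over [0, a]."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    h = a / n
    total = 1.0 + cmath.exp(-1j * r * a * a)  # endpoint values at t = 0 and t = a
    for k in range(1, n):
        t = k * h
        weight = 4 if k % 2 else 2
        total += weight * cmath.exp(-1j * r * t * t)
    return total * h / 3


def fresnel_limit(r):
    """Closed-form value of the integral of exp(-i r t^2) over [0, +infinity)."""
    return 0.5 * math.sqrt(math.pi / r) * cmath.exp(-1j * math.pi / 4)


def tail(r, a):
    """Integral of exp(-i r t^2) over [a, +infinity): full value minus partial integral."""
    return fresnel_limit(r) - fresnel_partial(r, a)


if __name__ == "__main__":
    a = 2.0  # hypothetical lower limit, playing the role of a_1(psi)
    for r in (10.0, 40.0, 160.0):
        # The modulus of the tail should stay below 1 / (r * a), i.e. Const./r.
        print(r, abs(tail(r, a)), 1.0 / (r * a))
```

The printed tail moduli should decrease like $1/r$, which is the behaviour the O-term in (27) relies on.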

It therefore remains to consider [b, where ®(¢; y) is of bounded variation 
in (O=t=a,(p)) and 0;y) uniformly with respect to in the 
sense that 


| <e forall ,0S1<8,8= independent of y. 
We next define the non-increasing function m(r) by 
(28) m(r) =lu.b.| ®(t;y)| for 0< tS17;0SyS 2x, 


so that m(r) > 0 as r>-+ o. Since we are interested only in very large 
values of r, we may always suppose 0 < 17? = B <a,(W) for all y. Let A(7) 
be a non-decreasing function of + which becomes infinite with 7 so slowly that 


(29) (A(r))*]A(r) > 9, (r> + %). 


In particular, we may let A(r) = min (77/4, (m[r'/4])-4). Now 




ay? 
(30) exp [— irt?]dt = (12; y)t4 exp [— irt]dt. 
0 


0 


Consider the last integral from 0 to b, where 0 <b a,°?(p). 


b | b 
(31) y)t* exp [— irt]dt |= m(b-4) j = m(b-*) - 203. 
0 0 
If we place b = 1°'(A(1) )?, the last expression on the right of (31) becomes 
2m[74/A(7) JA(r) = 0(174) 


by virtue of (29). Hence 


(32) &(t2; y)t exp [— wrt ]dt = = ¢(r)r4, 


where | €(")| << if r= R(e), R independent of y, since m(-) is by defini- 
tion independent of w and A depends only on 7 and on m(-). 
In order to appraise f, (12; y)t* exp [—irt]dt, we apply the 
b 


Second Mean Value Theorem to the monotone function ¢+, obtaining 


ea? (y) 
J) 


where it is understood that the Second Mean Value Theorem is applied sepa- 
rately to the real and the imaginary parts of the integral, the notation being 


Now ©(/2;W) is of bounded variation, inasmuch as ®(¢;y) is, so it may be 
supposed without loss of generality that ®(/4;y) is a bounded monotone 
function, whereupon the Second Mean Value may be applied to each of the 
integrals in (33). From (17) and the continuity of h’(¢;y), hence of 
on the torus and consequently for 0StSa,(p) or OS 
where 7 is the ¢ of the right-hand member of (30), and for 0=y < 2z, 
it that L,, and are bounded in uniformly with 
respect to wy. Therefore, from (22) and the remark immediately following it, 


one infers the existence of a constant K such that 
<< K forall 4,0 StS a,7(p), and all y in (0 


Finally. 0 <b < a,2(), so that [a,(w) | S b+ and from (33) it follows that 


1 
(34) y)t4exp [—irt]dt | S = 16h [A(r) = 




where the o-term is uniform with respect to y in the same sense as in (32), 
From (25), (27), (30), (32) and (34), it then follows that 


f f(t; exp [—tré?]dt =— $(— 2h,”r) 4(x)* exp [— in/4] + 0( 1%), 
0 


corresponding to (24). 
Substituting into (8), we obtain 


(35) = 5) exp [i(rh (di 3H) — 2/4) |] 


the o-term being uniform with respect to w. 

To calculate the integral J2, we observe that h’(¢;W) is negative for 
so that h(m() is in this interval steadily 
increasing from zero, and if we set 


t= | h(m(y) —h(o;y) 


¢ increases from 0 to a2(w) = | h(m(¥) 3 as increases 
from m(y) to m2.(y). By the introduction of ¢ as integration variable in J2, 


This last integral is of the form f exp [—irt?] dt, where 
0 


(36) f=f(tsy) 59). 


Just as :(¥) has already been so chosen that £/3 < m(W) —¢il(W) < 26/3, 
one may so choose that £/3 < $3(W) < 2£/3, where (defined 
just above equation (4)) is independent of y. Then from continuity con- 
siderations it follows that h’(¢(t,~);W) >-y. >0 for all ¢ in (OS Sa,(y) 
S (2u)4) and for all y, » being the maximum of h(¢;y) on the torus. 
Since h”(¢;¥) is continuous and of bounded variation in ¢, h’(¢:W) enjoys 
the same properties. Moreover, ¢ is a continuous monotone non-decreasing 
function of ¢, so that h’($(t,) ;w), as a function of ¢ in (OS tS «(y)) 
is of bounded variation in t. Consequently, by (8) and (a), f(t) is. for fixed 
y, a function of bounded variation in ¢. Moreover, f(0;p~)=0/h’ :¥)=9 
for all y, so that, in Jo, f(t; plays the role of f(t; ~) —f(0;y) = ®(/:¥) 
in J,, and, inasmuch as 


(37) lf(tsy)| Stn S 


it follows on the one hand that |f(t;y)|<e for all ¢ such that 
< —8.(e), where is independent of while, on the other hand, 




there exists a constant K, such that | f(t; ~)| < K, foralltin (0StSa,(y)) 
and all y. By the same reasoning as that used in the calculation of J,, it then 
follows that 

(38) 


where the o-term is independent of y. 

To each zero of h’ of the form ¢x-3, (k =1,---,n/2), there correspond 
two integrals, of which one, like J;, has @4x-; as lower integration limit, while 
the other, like Jyn, has dax-s3(—= sx-3 + 27) as upper integration limit. The 
contribution of order 7* from each of these may readily be shown to be the 
same, viz. 


Similarly, by a slight modification of the foregoing reasoning, it may be proved 
that to each zero of h’ of the form (k = 1, 2,---,/2), there correspond 
two integrals, from each of which the contribution of order 1-3 is 


(40) 3(2ar) Ah” exp [U(rh + 2/4) 


Finally, just as in the case of J2, it may be shown that for each of the integrals 
Jor+, (k =1,--+,), over an interval containing a zero of h”, 


(41) = 0(174), 


where the o-term is independent of y. 
From (39), (40) and (41), we then obtain (3), q.e.d. 


THE LINCOLN UNIVERSITY, 
CHESTER COUNTY, PENNSYLVANIA. 




ON TAUBERIAN THEOREMS FOR DOUBLE SERIES.* ¹


By RALPH PALMER AGNEW.


1. Introduction. Let

$$s_n = \sum_{k=1}^{n} u_k, \qquad \sigma_n = \frac{1}{n} \sum_{k=1}^{n} s_k$$

denote the sequence of partial sums and the $C_1$ transform of a real series
$\sum u_n$. A classic Tauberian theorem states that if $\sigma_n \to s$ and the unilateral
Tauberian condition $n u_n < K$ is satisfied, then $s_n \to s$.


Let

$$s_{m,n} = \sum_{j,k=1}^{m,n} u_{j,k}, \qquad \sigma_{m,n} = \frac{1}{mn} \sum_{j,k=1}^{m,n} s_{j,k} \qquad (m, n = 1, 2, \dots)$$

denote the sequence of partial sums and the $C_{1,1}$ transform of a real double
series $\sum u_{m,n}$. K. Knopp² has recently proved several Tauberian theorems of
which his third is the following:


If $\sigma_{m,n} \to s$ and $(m^2 + n^2)\,u_{m,n} < K$, then $s_{m,n} \to s$.
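
As a concrete reading of the two definitions above, the sketch below computes the partial sums $s_{m,n}$ and the $C_{1,1}$ means $\sigma_{m,n}$ of a finite block of terms. It is an illustration only; the function names are ours, and the code uses 0-based list indices where the text uses subscripts starting at 1.

```python
from itertools import accumulate


def cumulative_2d(a):
    """Return b with b[m][n] = sum of a[j][k] over j <= m and k <= n (0-based)."""
    row_sums = [list(accumulate(row)) for row in a]             # sum along each row
    col_sums = [list(accumulate(col)) for col in zip(*row_sums)]  # then down each column
    return [list(row) for row in zip(*col_sums)]                # transpose back


def partial_sums(u):
    """Partial sums s[m][n] of the double series with terms u[j][k]."""
    return cumulative_2d(u)


def cesaro_11(s):
    """C_{1,1} means: average of s over the top-left (m+1)-by-(n+1) block."""
    totals = cumulative_2d(s)
    return [
        [totals[m][n] / ((m + 1) * (n + 1)) for n in range(len(row))]
        for m, row in enumerate(totals)
    ]


if __name__ == "__main__":
    u = [[1, 2], [3, 4]]
    s = partial_sums(u)
    print(s)
    print(cesaro_11(s))
```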


The "natural" question whether this theorem holds when the Tauberian
condition $(m^2 + n^2)\,u_{m,n} < K$ is replaced by the weaker condition $mn\,u_{m,n} < K$
was raised and left unanswered by Knopp.

In §2 we give examples which show that the unilateral condition
$mn\,u_{m,n} < K$ will not serve; that the stronger O-condition $mn\,|u_{m,n}| < K$
will not serve; and in fact that the still stronger set of o-conditions


(1) $\lim_{n\to\infty} mn\,|u_{m,n}| = 0 \quad (m = 1, 2, \dots),$
    $\lim_{m\to\infty} mn\,|u_{m,n}| = 0 \quad (n = 1, 2, \dots),$
    $\lim_{m,n\to\infty} mn\,|u_{m,n}| = 0$


will not serve. The sequences $d_n$ and $\epsilon_n$ of §2 are specialized to obtain further
results of this character.
In § 3, we show that the situation is the same for many other methods 


* Received December 7, 1939. 

1 Presented to the American Mathematical Society, February 24, 1940. 

² K. Knopp, "Limitierungs-Umkehrsätze für Doppelfolgen," Mathematische Zeitschrift,
vol. 45 (1939), pp. 573-589, p. 581. Adjustment from Knopp's subscripts
0, 1, 2, ... to our subscripts 1, 2, 3, ... is easily made.




of summability, including the Cesaro methods of all positive orders and the 
Abel power series method. 
In §4, we show that the stronger hypothesis that all of the limits


$$\lim_{n\to\infty} \sigma_{m,n}, \qquad \lim_{m\to\infty} \sigma_{m,n}, \qquad \lim_{m,n\to\infty} \sigma_{m,n}$$


exist, the first for each $m = 1, 2, \dots$ and the second for each $n = 1, 2, \dots$,
together with a Tauberian condition such as (1), implies neither convergence
nor convergence by rows of $\sum u_{m,n}$.

It therefore appears that the double sequence $mn\,u_{m,n}$ does not play, in
Tauberian theory for double series, a rôle analogous to the rôle of the simple
sequence $n u_n$ in Tauberian theory for simple series.

In connection with the examples of §2, it is illuminating (but not
essential) to recognize the fact that if $\sum u_{m,n}$ has bounded partial sums $s_{m,n}$
and $s_{m,n} \to s$, then $\sigma_{m,n} \to s$; and that, irrespective of whether $\sum u_{m,n}$ has
bounded partial sums, if $s_{m,n} \to s$ and $\sigma_{m,n} \to \sigma$, then $s = \sigma$. General theory
and references to literature covering these points may be found in two papers
in this Journal.³ It follows that if $\sigma_{m,n} \to s$ and it is not true that $s_{m,n} \to s$,
then $\lim s_{m,n}$ cannot exist.


2. Some examples. Let $d_n$ be a bounded sequence of real non-negative
numbers such that $\sum d_n = \infty$. Let $\epsilon_n$ be a sequence of positive numbers for
which $0 < \epsilon_n \le 1$. Choose $D$ such that


$0 \le d_n < D \quad (n = 1, 2, 3, \dots),$
and let $n_0 = 0$.
We define by induction a sequence 


of indices and the terms umn of a series Stmn. For the first step in the induc- 


tion, take k Define for n = 1, 2,3,- by the formulas 
Uxin== Mar < NS 
(2 dn Ve 
) = — 6d n= 
0 otherwise 


where e’, is the lesser of €o;-1 and €2;3 vz is so chosen that 


(3) D < 1+1 + Any + dv, ) < 2D ; 


³ R. P. Agnew, "On summability of double sequences," American Journal of Mathematics,
vol. 54 (1932), pp. 648-656; "On summability of multiple sequences," ibid.,
vol. 56 (1934), pp. 62-68.




nx, is so chosen that 


(4) dy) — ee (dryer + des + Ody) 


is =0 when 6=1 and g=n,—1 but is <0 when 6=1 and g=m;; and 
finally 6 is chosen such that the difference (4) is 0 when g =m and 6= \. 


Observe that <1. Let be defined for n 2,3,--- by the 
formulas 
(5) Usk,n == — Uek-1,n (n 1, 25 3,° 


Successive steps in the induction are obtained by giving k the values 2, 3, 4,--: 
in turn. 

The terms of the series Stmn which we have just defined may be displayed 
in the form 


in which the value of each $u_{m,n}$ which may differ from 0 is represented by an
$x$ or by a $y$. The definition of $\theta$ implies that the sum of the $x$'s in each row
is 0, and (5) implies that each $y$ is the negative of the $x$ above it. These
considerations imply that the sequence $s_{m,n}$ of partial sums of the series
$\sum u_{m,n}$ may be displayed in the form


0,---,0,0,---,0,0,---,0,-- 
(7) 0,---,0,0,---,0,0,---,0 


in which the value of each Sm,n which may differ from 0 is represented by a z. 
The definitions of y%., m, and Um,n imply that 


(8) 0 Sm,n 2D (m, n= 2, 


(9) D < 8x-1,v, < 2D 




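Hence
(11) $\liminf_{m,n\to\infty} s_{m,n} = 0; \qquad D \le \limsup_{m,n\to\infty} s_{m,n} \le 2D$


and therefore $\lim s_{m,n}$ does not exist.
The fact that $0 \le s_{m,n} \le 2D$, and that at most $n$ of the terms $s_{j,k}$ in
the sum

$$\sigma_{m,n} = \frac{1}{mn} \sum_{j,k=1}^{m,n} s_{j,k}$$

are different from 0, implies that $0 \le \sigma_{m,n} \le 2D/m$ and hence that $\sigma_{m,n} \to 0$.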
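
The counting argument just given can be tried on a toy array. The sketch below is an illustration under stated assumptions: the array is not the sequence $s_{m,n}$ constructed in this section, but a hypothetical one sharing the single property used, namely at most one non-zero entry (of size $2D$) in each column. The helper names and the rule placing the non-zero entries are ours.

```python
def cesaro_mean(s, m, n):
    """C_{1,1} mean of the top-left m-by-n block of s (m, n >= 1)."""
    return sum(s[j][k] for j in range(m) for k in range(n)) / (m * n)


def diluted_array(size, D, row_of_column):
    """size-by-size array with exactly one entry 2*D in column k, in row row_of_column(k) mod size."""
    s = [[0.0] * size for _ in range(size)]
    for k in range(size):
        s[row_of_column(k) % size][k] = 2.0 * D
    return s


if __name__ == "__main__":
    D = 1.5
    s = diluted_array(12, D, lambda k: 5 * k + 3)  # arbitrary placement rule
    for m in (1, 4, 12):
        # Each block of m rows and n columns holds at most n non-zero entries,
        # so the mean cannot exceed 2 * D / m.
        print(m, max(cesaro_mean(s, m, n) for n in range(1, 13)), 2 * D / m)
```

However the non-zero entries are scattered, the means are forced to 0 as $m$ grows, while the entries themselves need not tend to 0.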
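Our definitions imply that, for each $k$ and $n$,


$$|u_{2k,n}| = |u_{2k-1,n}| \le \epsilon'_k d_n$$


and since $\epsilon'_k$ is the lesser of $\epsilon_{2k-1}$ and $\epsilon_{2k}$, this implies that
(12) $|u_{m,n}| \le \epsilon_m d_n \quad (m, n = 1, 2, \dots).$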


For the particular sequences


(13) $d_n = 1/n \log (n+1); \qquad \epsilon_n = 1/n 2^n$


the series $\sum u_{m,n}$ satisfies the Tauberian condition
(14) $mn\,|u_{m,n}| \le 1/2^m \log (n+1)$
while $\sigma_{m,n} \to 0$ and the sequence $s_{m,n}$ is bounded and $\lim s_{m,n}$ fails to exist.


For the sequences

(15) $d_n = \epsilon_n = 1/n\,[\log (n+2)]\,[\log \log (n+16)]$

we obtain the symmetric inadequate Tauberian condition

(16) $mn \log(m+2) \log(n+2)\,|u_{m,n}| \le 1/\log \log(m+16) \log \log(n+16).$
Each one of (14) and (16) demonstrates inadequacy of the o-conditions (1).
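
The passage from (12) to (14) and (16) is a direct substitution, which the short sketch below verifies numerically for a few indices. It assumes the readings $\epsilon_n = 1/(n 2^n)$ in (13) and $d_n = \epsilon_n = 1/(n \log(n+2) \log\log(n+16))$ in (15); the function names are ours.

```python
import math


def d13(n):
    """d_n of (13)."""
    return 1.0 / (n * math.log(n + 1))


def eps13(n):
    """epsilon_n of (13), read as 1 / (n * 2^n)."""
    return 1.0 / (n * 2.0 ** n)


def d15(n):
    """d_n = epsilon_n of (15)."""
    return 1.0 / (n * math.log(n + 2) * math.log(math.log(n + 16)))


def bound14(m, n):
    """Right-hand side of (14)."""
    return 1.0 / (2.0 ** m * math.log(n + 1))


def bound16(m, n):
    """Right-hand side of (16)."""
    return 1.0 / (math.log(math.log(m + 16)) * math.log(math.log(n + 16)))


if __name__ == "__main__":
    for m, n in [(1, 1), (3, 7), (10, 25)]:
        # m * n * epsilon_m * d_n is the largest value that (12) allows for m * n * |u_{m,n}|.
        print(m * n * eps13(m) * d13(n), bound14(m, n))
        print(m * n * math.log(m + 2) * math.log(n + 2) * d15(m) * d15(n), bound16(m, n))
```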


3. Other methods of summability. Let a, and bax, n =1.2.3.°°°, 
denote matrices of regular simple-sequence transformations 


~~ 

( 1 ‘ ) > An kSk > Dn 
k=1 


and let the matrix a,x satisfy the additional condition 

⁴ An example of a divergent series $\sum u_{m,n}$ which is summable $C_{1,1}$ and which satisfies
the condition $mn\,|u_{m,n}| < K$ and the condition $mn\,u_{m,n} \to 0$ as $m, n \to \infty$, has just been
published by W. Meyer-König, "Zur Frage der Umkehrung des C- und A-Verfahrens
bei Doppelfolgen," Mathematische Zeitschrift, vol. 46 (1940), pp. 157-160. Added in
the proof.




(18) $\lim_{n\to\infty} \operatorname{l.u.b.}_{k=1,2,\dots} |a_{n,k}| = 0.$


This condition is of course not satisfied when $a_{n,k}$ is the identity matrix
$\delta_{n,k}$, but it is satisfied for many other regular matrices. In particular, (18)
is satisfied when $a_{n,k}$ is the matrix


on\n+r—1/\n+r—2 n+r—k 


== () k>n 


of a Cesàro transformation $C_r$ whose order $r$ is a real or complex number
with a positive real part $r'$; for


\n +r —i1/\n+r—2 7 


when 1 =k = 7 s0 that if 7”? = 1, 


ja |S |r|/n (k=1,2,--°), 
andif0<r <1 
ja? | S[r 7) (k= 1,2,- -). 
It is well known that $C_r$ is regular when $r' > 0$.


Let $A \odot B$ denote the double sequence method of summability defined by


m,n 
j,k=1 


Let $\sum u_{m,n}$ and $s_{m,n}$ be as constructed in §2. Then $|s_{m,n}| \le 2D$ so that
the series in (19) converges absolutely; hence


@) 

whe 

One Am, 585,00 
k=1 j=l 


For each $k$ there is at most one $j$, say $\beta_k$, for which $s_{j,k} \ne 0$. Hence


(21) 2 Onin, 
so that 
00 
(22) = » Bn, x | [ u. b. | Gin,é | 2D | 
and therefore 
9 
(23) lim = 9, 


Thus $\sum u_{m,n}$ is summable $A \odot B$ to 0, and the examples of §2 apply to $A \odot B$
as well as to $C_{1,1}$. In particular the examples apply to the Cesàro transformation



$C_r \odot C_s$ if the real parts of $r$ and $s$ are positive. In case $r = s = 1$, $C_r \odot C_s$
becomes the special double sequence transformation $C_{1,1}$ previously considered.
It can be shown in the same way that if
~ X 
(24) o®(t) = > az (t)&; (¢) = by 
k=1 


are regular sequence-to-function transformations and 


(25) $\lim_{t \to t_0} \operatorname{l.u.b.}_{k=1,2,\dots} |a_k(t)| = 0,$


then each series $\sum u_{m,n}$ of §2 is summable to 0 by the double sequence-to-function
transformation
(26) (t,u) = 
k=1 
This applies to the Abel power series method for which
(27) $a_k(t) = t^{k-1}(1 - t),$
the variable $t$ approaching 1 over the real interval $0 \le t < 1$ or over the
complex sets of Stolz and Pringsheim.
4. Convergence by rows. A double series is called convergent by rows
to $s_R$ if

$$\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} u_{m,n} = s_R$$

or, what amounts to the same thing, if $\lim_{n\to\infty} s_{m,n}$ exists for each $m = 1, 2, \dots$
and
(28) $\lim_{m\to\infty} \lim_{n\to\infty} s_{m,n} = s_R.$

The series constructed in §2 converge by rows to 0; hence the examples do
not preclude the possibility that $\sigma_{m,n} \to s$ and the Tauberian condition
$mn\,u_{m,n} < K$ may imply (28) or at least the weaker condition


(29) $\lim_{m\to\infty} \liminf_{n\to\infty} s_{m,n} = \lim_{m\to\infty} \limsup_{n\to\infty} s_{m,n} = s.$


This question and others are settled by the following example. 


Let dn, en, and D be given as in $2; and for each k = 1,2.-- let 
be the least of the four numbers €x-3, aNd For each = 1,2,°--, 


choose n; such that 


D<ek(di + dn) < 2D 


and Jet 




= 
=0 n> n,. 
The sequence $s_{m,n}$ of partial sums, and the transforms by various methods of
summability, of this series are more complicated than those for the series of
§2. However it is possible to show that $\sum u_{m,n}$ satisfies the Tauberian condition


$$|u_{m,n}| \le \epsilon_m d_n;$$


that $-2D \le s_{m,n} \le 2D$; that if $\sigma_{m,n}$ is as before the $C_{1,1}$ transform of $\sum u_{m,n}$,
then


(30) $\lim_{n\to\infty} \sigma_{m,n}, \qquad \lim_{m\to\infty} \sigma_{m,n}, \qquad \lim_{m,n\to\infty} \sigma_{m,n}$

all exist, the first for each $m = 1, 2, \dots$ and the second for each $n = 1, 2, \dots$;
that $\sum u_{m,n}$ fails to converge; and finally that each row of the series $\sum u_{m,n}$
converges but that the series of values of the rows does not converge.

This example is of interest because existence of the first limits in (30)
and the Tauberian condition $mn\,u_{m,n} < K$ imply (by iterated use of the
Tauberian theorem for simple series given in §1) convergence of each row of
$\sum u_{m,n}$. The example shows that existence of all of the limits in (30) and
stronger Tauberian conditions $|u_{m,n}| \le \epsilon_m d_n$ do not imply convergence of the
series of values of the rows.


5. Conclusion. It is sometimes desirable to have, in addition to a
proof of a result, a plausible argument which indicates roughly why the
result may possibly hold. The question here is "why" $\sigma_{m,n} \to s$ and
$mn\,u_{m,n} < K$ can fail to imply $s_{m,n} \to s$, as $\sigma_n \to s$ and $n u_n < K$ imply $s_n \to s$.
The "answer" seems to be that the condition $mn\,u_{m,n} < K$ does not prevent
an effective dilution of a double sequence $s_{m,n}$ by insertion of zeros in the two
dimensional pattern, while $n u_n < K$ does prevent an effective dilution of a
simple sequence $s_n$ by insertion of zeros in the linear pattern.


CORNELL UNIVERSITY,
ITHACA, NEW YORK.




ANALYTIC FUNCTIONS AND MULTIPLE FOURIER INTEGRALS.* 


By W. T. Martin. 


Introduction. In the first part of this note we consider the class E of
entire functions $f(z_1, \dots, z_n)$ which satisfy relations of the form


-& 


for all finite values of $y_1, \dots, y_n$, where $a$ and $A$ are positive constants. It is
easily shown that this class of functions is identical with the class of functions
having Fourier transforms $\phi(u_1, \dots, u_n)$ which vanish outside
a certain finite region. Next if we denote by $K$ the common part of all convex
bodies (in the $u$-space) in whose exteriors $\phi$ vanishes identically and by $s(\lambda)$
its supporting function,

(2) $s(\lambda) = \max_{(u) \in K} \{\lambda_1 u_1 + \cdots + \lambda_n u_n\}, \qquad \lambda_1^2 + \cdots + \lambda_n^2 = 1,$

then we show that $s(\lambda)$ is equal to a growth-function $h(\lambda)$ of $f$ defined as
follows


(3) $h(\lambda) = \lim_{\rho\to\infty} \frac{1}{2\rho} \log \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} |f(x_1 + i\lambda_1\rho, \dots, x_n + i\lambda_n\rho)|^2\, d\omega_x.$


From these considerations it follows that the class E is identical with the class
considered by Plancherel and Pólya² of entire functions of integrable square
over the real space, and that the growth-function
defined in (3) is equal to the growth-function


(4) $h_P(\lambda) = \max_{(x)} \limsup_{\rho\to\infty} \frac{1}{\rho} \log |f(x_1 + i\lambda_1\rho, \dots, x_n + i\lambda_n\rho)|$


defined by them.

In the second section we prove results of a similar nature for the class of
functions $f$ analytic in the "octant" $\mathrm{Im}\{z_k\} > 0$, and satisfying
relations of the form (1) for all positive values of $y_1, \dots, y_n$.

* Received October 12, 1939. 

¹ The idea of considering a growth-function of the sort defined here arose in a
conversation which the author had with Professor S. Bochner.

² M. Plancherel and G. Pólya, "Fonctions entières et intégrales de Fourier multiples,"
Commentarii Math. Helvetici, vol. 9 (1936-37), pp. 224-248; vol. 10 (1937-38),
pp. 110-163.



1. The class P of entire functions. We consider the class P of functions
$f$ representable in the form

(5) $f(z_1, \dots, z_n) = \left(\frac{1}{2\pi}\right)^{n/2} \int_{-a}^{a} \cdots \int_{-a}^{a} \phi(u)\, e^{-i(z_1 u_1 + \cdots + z_n u_n)}\, d\omega_u,$

where $\phi(u)$ is of integrable square over $-a < u_k < a$,
$k = 1, \dots, n$, and $d\omega_u$ is the volume element $du_1 \cdots du_n$, and we show that
this class is identical with the class E of entire functions satisfying relations
of the form (1). First, by the Schwarz inequality,


a 
| J f ~itnen dy, |? 
-a ¢ an 


a a 
= f f | |? dow f f +Ynthn) Jory (: 


and thus the function f defined in (5) is an entire function. Next by 
Plancherel’s theorem 


oo a 
f | f(a + 14/1, >Un + 1Yn) f f | |e? *Yntln) du, 
f f | |? don, (1 
—a ¢ 


and thus a relation of the form (1) holds. Conversely, if f belongs to the 
class E, then for each (y1,---,Yyn) it has a Fourier transform y,,)(w). 


le 

By a theorem due to Bochner,* since the left-hand side of (1) is bounded of 
) 

for (y) in any bounded region, it follows that yYy,(w) has the form 7 
‘ 


Un) Thus by Plancherel’s theorem 


1 n/2 00 (1 


We next show that if f has the representation (6) and if (1) holds then (1 
¢ = 0 outside the “cube” C[—a<u,<a, k=1,:--, n|. For suppose 
¢ #0 in some region Pf which lies outside C. There is no loss in generality 
in assuming that is of the form < uj < Bee where 
a<%< fi. Then for +, positive Plancherel’s theorem yields 


e « e an 


R. 


On f 

wer 


an Ey 
³ S. Bochner, "Bounded analytic functions in several variables and multiple
Laplace integrals," American Journal of Mathematics, vol. 59 (1937), pp. 732-738;
esp. 733-734.




As > for *,%n fixed and positive, this contradicts (1) since 
| lo 
a< %, and f, | & |°dow > 0. Thus we have a contradiction and hence ¢ = 0 
R 


outside $C$. Hence the two classes P and E are identical.

Next let $f$ belong to the class P (= E) and let us denote by $K$ the
intersection of all convex bodies in the $u$-space in whose exteriors $\phi = 0$,
and let $s(\lambda)$ be the supporting-function of $K$ defined as in (2). Then $f$ has
(8) f (415° -(5 

2a 


n/2 
K 
and 
. 
(9) f | f(a + + dnp) |?dox 
-00 e 


K 


< f | |?dou. 
Jk 


the representation 


Thus 


co 
(10) =limsup log f f | + tAnp) |?dwr S 8(A). 
-xX 


In order to see that the actual limit in (10) exists and is equal to $s(\lambda)$
let us consider a fixed direction $(\lambda^0)$. Then there is an extreme point⁴ $(u^0)$
of $K$ such that $s(\lambda^0) = \lambda_1^0 u_1^0 + \cdots + \lambda_n^0 u_n^0$. Moreover for $\delta > 0$ there
clearly exists a neighborhood $N = N(\delta)$ of $(u^0)$ such that


(11) $s(\lambda^0) - \delta \le \lambda_1^0 u_1 + \cdots + \lambda_n^0 u_n \le s(\lambda^0) + \delta$, for $(u) \in NK$.
Hence


00 
(12) f | f(a, + -t- p ) 


e 


. 
K 
NK 


⁴ By an extreme point of a convex body $K$ is meant a boundary point which is not
an inner point of any line segment of $K$. For each direction $(\lambda)$ there is an extreme
point which lies on the supporting plane in that direction, i.e. on $\lambda_1 u_1 + \cdots + \lambda_n u_n = s(\lambda)$.
An extreme point also possesses the property that if any neighborhood $N$
of it is omitted from $K$, then the convex extension of $K - NK$ is a proper subset of $K$.
For these properties see T. Bonnesen and W. Fenchel, "Theorie der konvexen Körper,"
Ergebnisse der Math. und ihrer Grenzgebiete, Berlin (1934), esp. pp. 15, 16 or G. Pólya,
"Untersuchungen über Lücken und Singularitäten von Potenzreihen," Mathematische
Zeitschrift, vol. 29 (1929), pp. 549-640, esp. pp. 573-578.




Now $\int_{NK} |\phi|^2\, d\omega_u > 0$ since otherwise $\phi$ would be identically zero in $NK$
and thus $\phi$ would be identically zero in the exterior of the convex body $K^*$
which is the convex extension of $K - NK$. But this is impossible since $K^*$
is a proper subset of $K$ (see ⁴) and this contradicts the definition of $K$.


Hence (12) yields 


(13) lim int “tog + = 8(A°) —3, 


From (10) and (13), since $(\lambda^0)$ is an arbitrary direction and $\delta$ is an arbitrary
positive number, it follows that the limit in (10) exists and that it is equal
to $s(\lambda)$.
We have proved the following theorem.


THEOREM 1. Let $f(z_1, \dots, z_n)$ be an entire function satisfying (1).
Then the limit in (3) exists and is equal to $s(\lambda)$, where $s(\lambda)$ is the supporting
function defined by (2) of the convex body $K$, where $K$ is the intersection
of all convex bodies in whose exteriors the Fourier transform of $f$ is identically
zero.
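
To make the supporting function in Theorem 1 concrete, the sketch below evaluates $s(\lambda)$ for a convex body given as the convex hull of finitely many points, where the maximum in (2) is attained at one of the given points. This is an illustration only: the theorem concerns a general convex body $K$, and the square used in the example and the function name are our own hypothetical choices.

```python
import math


def supporting_function(points, direction):
    """s(lambda) = max of lambda_1 u_1 + ... + lambda_n u_n over the given points.

    For the convex hull of finitely many points the maximum of a linear form
    is attained at one of those points, so scanning them suffices.
    """
    return max(sum(l * u for l, u in zip(direction, p)) for p in points)


if __name__ == "__main__":
    a = 2.0
    # Vertices of the square -a <= u_k <= a in the plane (the "cube" C for n = 2).
    square = [(-a, -a), (-a, a), (a, -a), (a, a)]
    for lam in [(1.0, 0.0), (1 / math.sqrt(2), 1 / math.sqrt(2)), (0.6, -0.8)]:
        # For the cube, s(lambda) equals a * (|lambda_1| + |lambda_2|).
        print(lam, supporting_function(square, lam), a * (abs(lam[0]) + abs(lam[1])))
```

For the cube the two printed values agree, which matches the growth bound $e^{2a(|y_1| + \cdots + |y_n|)}$ obtained from Plancherel's theorem in this section.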


Plancherel and Pólya (loc. cit.²) have considered the class P of functions
and have shown that the growth function $h_P(\lambda)$ defined by them as in (4)
is equal to the function $s(\lambda)$. Thus we have


COROLLARY. If $f \in P$ then


(14) Lim f | f(a + trip, ) 


¥ pc P 


max Lim sup log | + tip, Qn + iAnp) |. 


In connection with Theorem 1, let us remark that Plancherel and Polya 
(loc. cit.*, p. 146) have shown that 


e 
= fc f | f(x,° -dwy 
-00 


where $c$ is the cardinal increase of $f$, that is, $c$ is the greatest value of $h_P(\lambda)$
for $(\lambda)$ ranging over all the sets for which one $\lambda_k$ is $\pm 1$ and all others are 0.
They also obtain an analogous result for the class $L^p$.




2. Functions analytic in the "octant" $Q = [\mathrm{Im}\{z_k\} > 0,\ k = 1, \dots, n]$.
Let $f(z_1, \dots, z_n)$ be analytic in $Q$ and let it satisfy a relation of
the form (1) for all positive values of $y_1, \dots, y_n$. We shall obtain a result
for this case analogous to that obtained in the previous section. Define


Then by (1) (for positive) 


f | + tm, ° + tyn) 
A; 


Now Bergmann and Martin *® have shown that a function g analytic in Q and 
satisfying a relation of the form (16) for (4, - -, 4) positive has a Fourier 
transform which vanishes outside the octant = < 0, 
k=1,---,n] and which has the property that y(w) eZ. Thus g has a 


representation of the form 


Using (15), we see that 


n/2 a 
Qr 
where 


(19) (14, ° . Un) = y(ui —a). 


Thus the class of all functions f analytic in Q and satisfying relations 
of the form (1) for all positive values of y;,° * -, Yn, is contained in the class 


of functions defined by 


1\"2 « 


where L? over — and vanishes identically 
elsewhere. That these two classes are identical follows at once. We omit 
the details. 


⁵ S. Bergmann and W. T. Martin, "On a modified moment problem in two variables,"
to appear in the Duke Mathematical Journal. See esp. Theorem 1.




Next let $S_p$ be a closed sphere of radius $p$, center $(u) = 0$,
and let $K_p$ be the intersection of all convex bodies $C$ in $S_p$ such that $\phi = 0$
in $S_p - C$. Then clearly $K_p \subset K_{p+1}$, and the point set $K'$ consisting of all
points in any $K_p$, $p = 1, 2, \dots$, has the property that $\phi = 0$ outside $K'$. Thus


1 
(21) f(21,° =(4) (z) €Q, 


and for positive *,An we have 


(22) J ++ + tAnp) |?dwz 
- - < e28' (A) p f, | 
K’ 


where 

23) s’(A) = 1. u. b. ++ Antn} An positive.® 
(uyeK 

Hence 


(24) sup f | + tArp,* + tAnp) S 8’(A) 


for positive A’s. Again we can show that the actual limit exists and that 
it is equal to s’(A). For this purpose let us apply Plancherel’s theorem to 
(21). Then 


Kp 
Now by Theorem 1 we have 
1). 1 292 (Aquat +AnUn) p 


and thus 


(27) lim int f(a. + trip, + iAnp) = 8p(A), 
(p= 1,2,° . 


for positive. This implies that the left-hand side of (27) 18 
greater than or equal to s’(A) for positive A’s. For let (A°) be a_ positive 


*It is clear that K’ C <u, <a, K=1,...-,n] and hence that 


leu. b. fr, u, +.. -+A,U,} 5 (A, +---+A,)q@ for d,,- -,A, positive. 
(u) €K’ 




direction. Then in view of the definition (23) of s’(A) there is a sequence (wu?) 
of points such that (wu?) « Kv,, (where v»—> «© as p—> o) and such that 


The relation (28) together with the fact that A,°u,? + S Sy, (A°) 


gives 


lim inf s,,(A°) 2 s’(A°), 
p00 


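and hence the left-hand side of (27) (for $(\lambda) = (\lambda^0)$) is greater than or
equal to $s'(\lambda^0)$. Since $(\lambda^0)$ is an arbitrary positive direction this result
together with (24) yields the following result.

THEOREM 2. If $f$ is analytic in $Q$ and if (1) holds for positive $y_1, \dots, y_n$
then


$$\lim_{\rho\to\infty} \frac{1}{2\rho} \log \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} |f(x_1 + i\lambda_1\rho, \dots, x_n + i\lambda_n\rho)|^2\, d\omega_x = s'(\lambda)$$


for positive $\lambda$'s, where $s'(\lambda)$ is defined as in equation (23).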


THE MASSACHUSETTS INSTITUTE OF TECHNOLOGY. 




PROJECTIVE ANALOGUES OF THE CONGRUENCE OF 
NORMALS.* 


By O. BELL. 


1. Introduction. The projective normal at a point of a non-ruled sur- 
face S in ordinary space was defined by Fubini as the cusp-axis of y with 
respect to the extremal curves of his integral invariant 


f (abv’) #du. 


It is well known that the pseudo-normal which Green proposed as a projective 
analogue of the normal coincides with the projective-normal. 

Green and Fubini discovered, quite independently, certain analogies which 
exist between this line and the normal. Green noted that the projective 
normal, like the normal, is intrinsically connected with the surface, and that 
the curves which correspond to the developables of the projective normal con- 
gruence resemble the lines of curvature by also forming a conjugate net. 
Fubini’s considerations reveal that both normal and projective normal may be 
defined as cusp-axes of certain integral invariants. Fubini’s definition lacks 
geometric significance without a geometric interpretation for his integral in- 
variant. The author [1, p. 403]? has recently provided such an interpretation. 

Green designated a congruence whose developables correspond to a con- 
jugate net a conjugate congruence. Grove [2] has proved analytically the 
existence of a class of covariant conjugate congruences a general one of which 
he calls an R-conjugate congruence. He does not however characterize
geometrically any one of these congruences. It is the purpose of this paper to
present a method for the geometric determination of a general R-conjugate
congruence and to show that it is also characterized by the other important
property of the projective normal by being similarly determined with respect
to the extremals of an integral invariant. A method will also be given for
the geometric interpretation of these extremals. Finally certain special
R-conjugate congruences will be introduced.

Let the surface S be referred to its asymptotic net as parametric, with the 
fundamental differential equations in Wilczynski’s canonical form 


(1.1)  $y_{uu} + 2b\,y_v + f\,y = 0, \qquad y_{vv} + 2a'\,y_u + g\,y = 0.$


* Received November 10, 1939. 
1 Numbers in brackets refer to the bibliography at the end of the paper. 




Using the notation introduced in the celebrated memoir by Green [3] let us
consider the parametric vector equations

(1.2)  $y = y(u, v), \quad \rho = y_u - \beta y, \quad \sigma = y_v - \alpha y, \quad \tau = y_{uv} - \alpha y_u - \beta y_v + \alpha\beta y,$

where α and β are arbitrary analytic functions of u and v. Equations (1.2)
define the general homogeneous coordinates of four points which we denote
simply by y, ρ, σ and τ when no confusion can arise. The line l joining the
points ρ, σ, according to Green’s classification, is an arbitrary line of the first
kind and generates a congruence Γ of the first kind as y moves over S. The
reciprocal l′ of the line l with respect to S at y is an arbitrary line of the
second kind and generates a congruence Γ′ of the second kind as y moves over
S. If the functions α, β are chosen suitably the points y, ρ, σ and τ become
covariant points and the congruences Γ and Γ′ become covariant congruences.


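2. Conjugate congruences. Consider any two covariant points $\omega_1$ and
$\omega_2$ which are collinear with y but do not lie in the tangent plane to S at y.
The general coordinates for $\omega_1$ and $\omega_2$ are given by $\omega_i = \tau + r_i y$, (i = 1, 2),
where τ is defined by (1.2) and $r_1$ and $r_2$ are functions of u, v. The tangent
lines at $\omega_1$ and $\omega_2$ to the curves described by these points as y moves along a
curve $C_\lambda$, defined by $dv - \lambda(u, v)\,du = 0$, intersect the tangent plane to S at y
in the points which we denote by $W_1^{(\lambda)}$ and $W_2^{(\lambda)}$. Expressions for the
coordinates of $W_i^{(\lambda)}$, (i = 1, 2), are linear combinations of $\omega_i$ and
$(\omega_i)_u + \lambda(\omega_i)_v$ which do not contain $y_{uv}$. The terms of $(\omega_i)_u + \lambda(\omega_i)_v$
which involve $y_{uv}$ are equal to $-(\beta + \alpha\lambda)\,y_{uv}$. Hence, the expressions for
the general coordinates of $W_1^{(\lambda)}$ and $W_2^{(\lambda)}$ are given by

(2.1)  $W_i^{(\lambda)} = (\omega_i)_u + \lambda(\omega_i)_v + (\beta + \alpha\lambda)\,\omega_i, \qquad (i = 1, 2).$

Making use of the forms for $\omega_1$ and $\omega_2$ and the equations (2.1), we have

(2.2)  $W_2^{(\lambda)} - W_1^{(\lambda)} = R\,[(y_u - \bar\beta y) + \lambda(y_v - \bar\alpha y)],$

where

$\bar\beta = -(\beta + [\log R]_u), \qquad \bar\alpha = -(\alpha + [\log R]_v), \qquad R = r_2 - r_1.$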
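
As a check on (2.2), the subtraction can be written out step by step. This is only a sketch, assuming the forms $\omega_i = \tau + r_i y$ and (2.1) as read above, with $R = r_2 - r_1$, $\bar\beta = -(\beta + [\log R]_u)$ and $\bar\alpha = -(\alpha + [\log R]_v)$:

```latex
\begin{align*}
W_2^{(\lambda)} - W_1^{(\lambda)}
 &= (\omega_2-\omega_1)_u + \lambda(\omega_2-\omega_1)_v
    + (\beta+\alpha\lambda)(\omega_2-\omega_1),
    \qquad \omega_2-\omega_1 = R\,y,\\
 &= R\,y_u + R_u\,y + \lambda\,(R\,y_v + R_v\,y) + (\beta+\alpha\lambda)\,R\,y\\
 &= R\,\bigl[\,(y_u + (\beta + [\log R]_u)\,y)
    + \lambda\,(y_v + (\alpha + [\log R]_v)\,y)\,\bigr]\\
 &= R\,\bigl[\,(y_u - \bar\beta\,y) + \lambda\,(y_v - \bar\alpha\,y)\,\bigr].
\end{align*}
```

The terms in τ cancel in the difference, which is why only $y$, $y_u$ and $y_v$ survive.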


Let $t_\lambda$ denote the tangent to $C_\lambda$ at y. Let $\nu_\lambda$ denote the point of intersection
of $t_\lambda$ and the line joining $W_1^{(\lambda)}$ and $W_2^{(\lambda)}$. The right hand member of (2.2)
is clearly the expression for the general coordinates of $\nu_\lambda$. We shall call the
point $\nu_\lambda$ the ν-point of $t_\lambda$ corresponding to the points $\omega_1$ and $\omega_2$.

Since the right hand member of (2.2) is a linear combination of $y_u - \bar\beta y$
and $y_v - \bar\alpha y$, the point $\nu_\lambda$, for any value of λ, lies on a straight line $\bar l$ which
joins $\bar\rho$ and $\bar\sigma$ given by

$\bar\rho = y_u - \bar\beta y, \qquad \bar\sigma = y_v - \bar\alpha y,$

where $\bar\beta$ and $\bar\alpha$ are defined above. Hence, we have the theorem


THEOREM (2.1). As the direction λ is varied, while u and v are held
constant, the ν-point of $t_\lambda$ corresponding to the points $\omega_1$ and $\omega_2$ describes a
straight line $\bar l$.


The point p of intersection of the line $\bar l$ with the reciprocal of the line
joining $\omega_1$ and $\omega_2$ has general coordinates of the form

$p = (2\alpha + [\log R]_v)\,(y_u + [\log R]_u\,y/2) - (2\beta + [\log R]_u)\,(y_v + [\log R]_v\,y/2).$

Let $t_\mu$ denote the tangent to S at y which passes through the point p. In view
of the forms of the functions $\bar\alpha$, $\bar\beta$, we have

THEOREM (2.2). The harmonic conjugate of the tangent $t_\mu$ with respect
to the line $\bar l$ and the reciprocal of the line joining $\omega_1$ and $\omega_2$ is the R-harmonic
line, which joins the points ρ and σ given by $\rho = y_u + (\log R)_u\,y/2$ and
$\sigma = y_v + (\log R)_v\,y/2$. The reciprocal of this line is the R-conjugate line.


To complete the characterization of the R-conjugate line for a given
function R = R(u, v) it is, of course, necessary to have geometric definitions
of covariant points $\omega_1$ and $\omega_2$ whose general coordinates are related by the
equation $\omega_2 = \omega_1 + kRy$, k = const.

The integral $\int (Rv')^{1/2}\,du$, where R(u, v) is associated with covariant
points $\omega_1$, $\omega_2$ in the manner described in the preceding paragraph, is an
integral invariant which is projectively and intrinsically related to the arc of a
curve along which it is calculated. The extremals of this integral are defined
by the curvilinear equation

(2.3)  $v'' = (\log R)_u\,v' - (\log R)_v\,v'^2.$

It is well known that if Wilczynski’s canonical form (1.1) is used, the cusp-axis
of y with respect to a two parameter family of hypergeodesics defined by

$v'' = A + Bv' + Cv'^2 + Dv'^3,$

passes through y and the point z given by $z = y_{uv} - \alpha y_u - \beta y_v$, where α = C/2,
β = −B/2. Hence, we have

THEOREM (2.3). The R-conjugate line is the cusp-axis of y with respect
to the extremal curves of the integral invariant $\int (Rv')^{1/2}\,du$.
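
Equation (2.3) can be recovered from the Euler-Lagrange equation of the integral $\int (Rv')^{1/2}\,du$. The following is a sketch only; the symbol F for the integrand is introduced here for convenience and does not appear in the paper, and v is regarded as a function of u.

```latex
\begin{align*}
F(u,v,v') &= (R\,v')^{1/2}, \qquad
F_{v'} = \tfrac12 R^{1/2}\,v'^{-1/2}, \qquad
F_{v} = \tfrac12 R^{-1/2} R_v\,v'^{1/2},\\
\frac{d}{du}F_{v'} &= \tfrac14 R^{-1/2}\,(R_u + R_v v')\,v'^{-1/2}
                     - \tfrac14 R^{1/2}\,v'^{-3/2}\,v''.
\end{align*}
Setting $\frac{d}{du}F_{v'} = F_v$ and multiplying by $4R^{-1/2}v'^{3/2}$ gives
\[
(\log R)_u\,v' + (\log R)_v\,v'^2 - v'' = 2\,(\log R)_v\,v'^2
\quad\Longrightarrow\quad
v'' = (\log R)_u\,v' - (\log R)_v\,v'^2 .
\]
```

Comparing with the hypergeodesic form $v'' = A + Bv' + Cv'^2 + Dv'^3$ gives $B = (\log R)_u$ and $C = -(\log R)_v$, so that α = C/2 = −(log R)_v/2 and β = −B/2 = −(log R)_u/2, which are the functions belonging to the R-conjugate line.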


To add to the geometric significance of the above theorem the extremal
curves of the integral $\int (Rv')^{1/2}\,du$ will be geometrically characterized.




THEOREM (2.4). The tangent $t_\mu$ associated geometrically with covariant
points $\omega_1$ and $\omega_2$ which lie on the cusp-axis of y with respect to a pencil $p_\lambda$ of
conjugate nets and the tangent $t_\lambda$ of the curve $C_\lambda$ of the fundamental net $N_\lambda$
at y are conjugate tangents if, and only if, the curve $C_\lambda$ is an extremal of the
integral invariant $\int (Rv')^{1/2}\,du$, where $kRy = \omega_2 - \omega_1$, k = const.

According to the hypothesis we must have

(2.4)  $\lambda = (2\beta + [\log R]_u)/(2\alpha + [\log R]_v),$

where $\beta = -(\log\lambda)_u/2$ and $\alpha = (\log\lambda)_v/2$. Hence, on clearing of fractions
we obtain

$\lambda_u + \lambda\lambda_v = (\log R)_u\,\lambda - (\log R)_v\,\lambda^2,$

which, on substituting v′ for λ and v″ for $\lambda_u + \lambda\lambda_v$, becomes equation (2.3).
The operations are reversible and therefore the condition is necessary and
sufficient.


3. Special conjugate congruences. The projective normal is the special
case of the R-conjugate line for which R = ka′b, k = arbitrary const. To
complete its geometric characterization it is necessary to locate two points $\omega_1$
and $\omega_2$ whose general coordinates are related by the equation $\omega_2 - \omega_1 = ka'by$,
k = const. Two such points are the intersections (distinct from y) of an
arbitrary line of the second kind with the quadric of Wilczynski and the
quadric of Lie.

From the standpoint of analytic simplicity the projective normal is the 
best available projective substitute for the normal. From a geometric point 
of view, however, it is quite conceivable that there may be other R-conjugate 
lines equally suitable as a projective substitute for the normal. An R-conjugate
line of this character will be introduced in connection with a new pencil of
quadric surfaces.

Let $l_k$ denote a general line of the first canonical pencil. The line $l_k$
intersects the u and v-tangents to S at y in the points ρ, σ defined in (1.2),
where

(3.1)  $\beta = (\log a'^2 b)_u/k - (\log a'b)_u/2, \qquad \alpha = (\log b^2 a')_v/k - (\log a'b)_v/2.$


The lines 1, 1, 1, and la are the first directrix of Wilezynski, the reciprocal 
of the axis of Cech, the first canonical edge of Green, and the reciprocal of 
the projective normal, respectively. As y moves over S the points ρ, σ of $l_k$
generate transversal surfaces $S_\rho$ and $S_\sigma$ of the congruence described by $l_k$.
The v-tangent at ρ to $S_\rho$ intersects $l'_k$, the reciprocal of $l_k$, in the point which
we denote by $\eta_k$, whose coordinates are given by $\eta_k = \tau - \beta_v y$, where α, β are
the functions associated with $l_k$. Likewise the u-tangent at σ to $S_\sigma$ intersects
$l'_k$ in the point which we denote by $\xi_k$ whose coordinates are given by
$\xi_k = \tau - \alpha_u y$.² Let $\zeta_k$ denote the harmonic conjugate of y with respect to the
points $\eta_k$ and $\xi_k$. The general coordinates of $\zeta_k$ may be easily found to be
given by $\zeta_k = \tau + (k - 3)(\log a'b)_{uv}\,y/2k$, where the functions α, β in the
expression for τ are given by (3.1). It is well known that just one quadric
of Darboux at y passes through a given point not in the tangent plane to S
at y. The equation of the unique quadric of Darboux which passes through
the point $\zeta_k$, k = const., is easily found to be


(3. 2) — + (kk — 3) (log = 0. 


This quadric is, therefore, a general member of a pencil of quadrics whose 
members are in one to one correspondence with the lines of the first canonical 
pencil. The quadrics of this pencil will therefore be called canonical quadrics. 
The special case of (3.2) for k = 3 is clearly the canonical quadric of
Wilczynski. Stouffer [4], without introducing the general quadric (3. 2), has 
given the above characterization for the quadric of Wilczynski. 

The intersection of a general line l′ of the second kind with the quadric
(3.2) is a point, which we denote by $\omega_k$, whose general coordinates are given
by $\omega_k = \tau + (k - 3)(\log a'b)_{uv}\,y/2k$, where the functions α, β in the
expression for τ are arbitrary. The form for the coordinates of $\omega_k$ shows that
$\omega_j - \omega_k = c\,(\log a'b)_{uv}\,y$, where $c = (j - 3)/2j - (k - 3)/2k$, j, k = const.’s.
Hence, the following theorem is an immediate consequence.


THEOREM (3.1). If the fundamental points $\omega_1$ and $\omega_2$ are chosen as the
intersection of a line l′ of the second kind with two quadrics from the pencil
of canonical quadrics, the associated R-conjugate line is independent of the
choice of l′ and is independent of the selection of the two quadrics of the pencil.


For this line the associated functions α, β are given by $\alpha = -(\log R)_v/2$,
$\beta = -(\log R)_u/2$, where $R = (\log a'b)_{uv}$.


4. R-conjugate congruences associated with one-parameter families 
of curves. 


The transformation 
(4.1)  $y = x/R^{1/2}$


* The points 7, and & are special cases of the points 7, and 7,, respectively, which 
were introduced by Green [3, p. 95]. 




transforms the covariant points $R^{1/2}(y_u + R_u y/2R)$, $R^{1/2}(y_v + R_v y/2R)$
and

$R^{1/2}\{\,y_{uv} + R_v y_u/2R + R_u y_v/2R + [R_u R_v/4R^2 + (\log R)_{uv}/2]\,y\,\}$

into $x_u$, $x_v$ and $x_{uv}$ respectively. The points $x_u$, $x_v$ are the intersections of the
R-harmonic line with the asymptotic u and v-tangents to S at x, and the point
$x_{uv}$ lies on the R-conjugate line and is characterized like a point $\zeta_k$, but with $l_k$
replaced by the R-harmonic line. The effect of transformation (4.1) on
system (1.1) is to produce the following canonical form

(4.2)  $x_{uu} = p\,x + \theta_u x_u + \beta\,x_v, \qquad x_{vv} = q\,x + \gamma\,x_u + \theta_v x_v,$

wherein

$\theta = \log R, \qquad \beta = -2b, \qquad \gamma = -2a'.$

If R = a′b, the form (4.2) is Fubini’s canonical form.

The intersection of the R-harmonic line with the tangent at x to the curve
$C_\lambda$ defined by $dv - \lambda\,du = 0$ is the point $x_u + \lambda x_v$. The tangent plane at
$x_u + \lambda x_v$ to the ruled surface described by the R-harmonic line as x moves
along $C_\lambda$ intersects the R-conjugate line in a point $P_\lambda$ whose general
coordinates are found to be given by

(4.3)  $P_\lambda = x_{uv} + (p + q\lambda^2)\,x/2\lambda.$

The Γ-curves of the R-harmonic congruence form a conjugate net $N_{\lambda_1}$,
whose curvilinear differential equation is

(4.4)  $dv^2 - \lambda_1^2\,du^2 = 0$, where $\lambda_1 = (p/q)^{1/2}$.

The points $P_{-\lambda_1}$, $P_{\lambda_1}$ associated with the curves of $N_{\lambda_1}$ which pass through the
point x are given by

$P_{-\lambda_1} = x_{uv} - (pq)^{1/2}x, \qquad P_{\lambda_1} = x_{uv} + (pq)^{1/2}x.$
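
The coordinates of the two points $P_{\mp\lambda_1}$ follow by direct substitution. A short check, assuming (4.3) and (4.4) in the forms $P_\lambda = x_{uv} + (p + q\lambda^2)\,x/2\lambda$ and $\lambda_1 = (p/q)^{1/2}$:

```latex
\[
P_{\pm\lambda_1}
 = x_{uv} + \frac{p + q\lambda_1^{2}}{\pm 2\lambda_1}\,x
 = x_{uv} \pm \frac{2p}{2\,(p/q)^{1/2}}\,x
 = x_{uv} \pm (pq)^{1/2}\,x ,
 \qquad\text{since } q\lambda_1^{2} = p .
\]
```

The two points therefore differ by $2(pq)^{1/2}x$, which is why $(pq)^{1/2}$ plays the role of R when they are taken as fundamental points in the construction of § 2.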


We recall that a conjugate line may be determined in association with an
arbitrary line l′ by choosing fundamental points $\omega_1$, $\omega_2$ on l′ and following the
method outlined in § 2. The conjugate line thus determined with respect to
the fundamental points $P_{-\lambda_1}$, $P_{\lambda_1}$ which lie on an arbitrarily chosen R-conjugate
line is especially interesting because of its remarkable analytic, as well as
geometric, simplicity. By making use of equations (4.2) in carrying out the
analysis for this determination we obtain the following


THEOREM (4.1). The R-conjugate line ($R = [pq]^{1/2}$) determined with
respect to the points $P_{-\lambda_1} = x_{uv} - (pq)^{1/2}x$, $P_{\lambda_1} = x_{uv} + (pq)^{1/2}x$ of an
arbitrarily chosen R-conjugate line, as fundamental, passes through the points x
and $z = x_{uv} - a\,x_u - b\,x_v$, where a, b are defined by

$a = [\log(R/(pq)^{1/2})]_v/2, \qquad b = [\log(R/(pq)^{1/2})]_u/2.$

This line is the cusp-axis of the point x with respect to the extremal curves
of the integral invariant

$\int [(pq)^{1/2}\,v']^{1/2}\,du.$

Of course, other R-conjugate congruences may be associated with a given
one by selecting points $P_\lambda$ which are associated with other significant curves
of S. The investigation of some of these may prove interesting.


UNIVERSITY OF KANSAS. 


BIBLIOGRAPHY. 


(1) P. O. Bell, “A study of curved surfaces by means of certain associated ruled
surfaces,” Transactions of the American Mathematical Society, vol. 46
(1939), pp. 389-409.

(2) V. G. Grove, “On canonical forms of differential equations,” Bulletin of the
American Mathematical Society, vol. 36 (1930), pp. 582-586.

(3) G. M. Green, “Memoir on the general theory of surfaces and rectilinear
congruences,” Transactions of the American Mathematical Society, vol. 20
(1919), pp. 79-153.

(4) E. B. Stouffer, “A geometrical determination of the canonical quadric of
Wilczynski,” Proceedings of the National Academy of Sciences, vol. 18 (1932),
pp. 252-255.




CONVERGENCE THEOREMS FOR FUNCTIONS OF TWO 
COMPLEX VARIABLES.* 


By WILLIAM F. WHITMORE.


1. Introduction. The theory of harmonic measure has proved a very
valuable tool in the theory of functions of one complex variable. The
possibility of these applications is due on the one hand to the fact that the real or
imaginary part of an a.f.1c.v. (analytic function of one complex variable)
is a harmonic function and on the other to the fact that the Dirichlet problem
can be solved uniquely in terms of harmonic functions, thus assuring the
existence of the harmonic measure.¹ In attempting to carry over these ideas
to functions of two complex variables, one is confronted by the fact that it is
not possible to prescribe arbitrary boundary values for a biharmonic function
(real or imaginary part of an a.f.2c.v.) on the entire three dimensional
boundary of a four dimensional domain. In order to preserve at least a
portion of the properties of the one variable case, Bergmann (B₁) has introduced
the concept of domains with distinguished boundary surface. The three
dimensional boundary of such a domain contains a closed two dimensional
manifold, the distinguished surface (ausgezeichnete Randfläche, surface
remarquable), which has properties for the theory of a.f.2c.v. analogous to
those of the boundary for the one variable case, in that a regular a.f.2c.v.


* Received September 21, 1939.
¹ The method of approach used here has been chiefly developed by Stefan Bergmann
in a long series of papers, of which I have had occasion to cite five in particular:
(B₁) “Ueber die ausgezeichneten Randflächen in der Theorie der Funktionen von
zwei komplexen Veränderlichen,” Mathematische Annalen, vol. 104 (1931),
pp. 611-636.
(B₂) “Zwei Sätze aus dem Ideenkreis des Schwarzschen Lemma bei den
Funktionen von zwei komplexen Veränderlichen,” Mathematische Annalen, vol. 109
(1934), pp. 324-348.
(B₃) “Ueber eine Integraldarstellung von Funktionen zweier komplexer
Veränderlichen,” Mathematicheskii Sbornik, vol. 1 (43) (new series), pp. 851-861.
(B₄) “Ueber eine in gewissen Bereichen mit Maximumfläche gültige
Integraldarstellung der Funktionen zweier Variabler,” Mathematische Zeitschrift,
vol. 39, pp. 605-608.
(B₅) “Ueber eine Abschätzung von meromorphen Funktionen zweier komplexer
Veränderlichen in Bereichen mit ausgezeichneter Randfläche,” Travaux de
l’Inst. Math. Tbilissi, vol. 1, pp. 187-204.
The theory of harmonic measure for one variable is given in Nevanlinna: “Eindeutige
Analytische Funktionen,” here cited as (N).




attains its maximum on this surface, a biharmonic function is determined by 


its values there, etc. An example of a domain with distinguished surface is
given by any domain bounded by a finite number of analytic hypersurfaces
(three dimensional manifolds defined by analytic relations between the two
complex variables), the distinguished surface being formed by the intersections
of these hypersurfaces; e.g., the bicylinder $|z_1| < 1$, $|z_2| < 1$ with a
boundary composed of the two analytic hypersurfaces $z_1 - e^{i\varphi_1} = 0$, $|z_2| \le 1$
and $z_2 - e^{i\varphi_2} = 0$, $|z_1| \le 1$ has the distinguished surface $|z_1| = 1$, $|z_2| = 1$.

Although a biharmonic function is uniquely determined by its values on 
the distinguished surface of a domain, it is in general not possible to find a 


biharmonic function defined in the domain which assumes arbitrarily pre- 
scribed values on this surface. Hence a biharmonic measure cannot be used 
to generalize the notion of harmonic measure, for such a measure may not 
exist. Bergmann (B,) has met this further complexity by introducing the 
notion of functions of extended class. This class possesses properties necessary 
for the extension of harmonic measure; in particular, the property that to 
every bounded, piecewise continuous function given on the distinguished sur- 
face of a domain there corresponds a unique function of the extended class 
defined in the domain, and also that the operator defining the class is linear. 
The class depends, in general, on the domain. For a domain where the range 
of each complex variable is independent of the other (product domain, also 
called cylinder domain), the extended class is known to be the class of doubly 
harmonic functions (B₂), so that for such domains the notion of harmonic
measure can be replaced by that of doubly harmonic measure. For functions
of 1 c.v., Lindelöf has proved a theorem to the following effect (N, p. 44):

If an analytic function defined and bounded in the upper half-plane
converges to a limit for z tending to infinity along the negative real axis, then
it converges to the same value uniformly in each angle-space $\pi > \arg z > \eta > 0$.

Or stated for the unit circle: 

If an analytic function defined and bounded in the unit circle converges
to a limit for one-sided approach along the boundary to a given boundary 
point, then it converges to the same value uniformly along any path in the 
interior which ends at the given point and makes a positive angle with the 
circumference at the point. 

With the aid of the theory of doubly harmonic measure we shall show
that analogous results can be established on the convergence of bounded
functions of 2 c.v. defined in certain domains with distinguished surface.




2. Notation and definitions. We consider functions of the two complex
variables $z_k = x_k + iy_k$ (k = 1, 2). A doubly harmonic function u of the
four real variables $x_1, y_1, x_2, y_2$ is defined by the equations

(1)  $\dfrac{\partial^2 u}{\partial x_k^2} + \dfrac{\partial^2 u}{\partial y_k^2} = 0 \qquad (k = 1, 2).$

A biharmonic function is the real or imaginary part of an a.f.2c.v. and
satisfies in addition to equations (1) the equations

(2)  $\dfrac{\partial^2 u}{\partial x_1\,\partial x_2} + \dfrac{\partial^2 u}{\partial y_1\,\partial y_2} = 0, \qquad \dfrac{\partial^2 u}{\partial x_1\,\partial y_2} - \dfrac{\partial^2 u}{\partial x_2\,\partial y_1} = 0,$

as can be verified by application of the Cauchy-Riemann equations. The
symbol ∩ indicates the intersection of two point sets; the symbol × indicates
their topological product. E[···] denotes the set of points satisfying the
relations enclosed in the brackets. A four or two dimensional domain will
be indicated by a capital letter and the corresponding three or one dimensional
boundary by the corresponding small letter. An upper index j attached to
the symbol for a set gives its dimensionality (0 ≤ j ≤ 4); e.g., $\mathfrak{S}^2$ is a two
dimensional set. Let $\mathfrak{G}_k^2$ (k = 1, 2) be a domain in the $z_k$-plane, bounded
by a finite number of Jordan arcs $\mathfrak{g}_k^1$ ($\mathfrak{G}_k^2$ may be multiply connected). The
product domain $\mathfrak{A}^4 = \mathfrak{G}_1^2 \times \mathfrak{G}_2^2$ is a four dimensional domain in the
$(z_1, z_2)$-space. The two dimensional surface $\mathfrak{F}^2 = \mathfrak{g}_1^1 \times \mathfrak{g}_2^1$ is the distinguished
surface of $\mathfrak{A}^4$. As noted in the introduction, the Dirichlet problem of
determining a function defined in $\mathfrak{A}^4$ which assumes prescribed bounded and
piecewise continuous values on $\mathfrak{F}^2$ can be solved uniquely in terms of doubly
harmonic functions; in the case of a bicylinder, an explicit form for the
desired function is given by an iterated Poisson integral (B₂). Hence, if $\mathfrak{S}^2$
is a subset of $\mathfrak{F}^2$ having positive two dimensional measure, there exists a unique
doubly harmonic function defined in $\mathfrak{A}^4$ which assumes the value 1 on $\mathfrak{S}^2$ and
the value 0 on $\mathfrak{F}^2 - \mathfrak{S}^2$. This function will be denoted by $\omega(z_1, z_2; \mathfrak{S}^2, \mathfrak{A}^4)$
and is defined to be the doubly harmonic measure of $\mathfrak{S}^2$ with respect to $\mathfrak{A}^4$
taken at the point $\{z_1, z_2\}$.
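
For a product of two upper half-planes and a boundary set that is itself a product of two real intervals, the doubly harmonic measure factors into two ordinary harmonic measures: the product of the two one-variable measures is harmonic in each variable separately and takes the values 1 and 0 on the right parts of the distinguished surface. The following Python sketch illustrates this special case only. The function names and the restriction to interval products are assumptions made for the example and are not taken from the paper.

```python
import cmath
import math


def half_plane_measure(z: complex, a: float, b: float) -> float:
    """Harmonic measure at z (Im z > 0) of the real interval (a, b) with
    respect to the upper half-plane: the angle subtended by (a, b) at z,
    divided by pi."""
    if z.imag <= 0 or a >= b:
        raise ValueError("need Im z > 0 and a < b")
    return (cmath.phase(z - b) - cmath.phase(z - a)) / math.pi


def doubly_harmonic_measure(z1, z2, interval1, interval2):
    """Hypothetical helper: doubly harmonic measure of interval1 x interval2
    with respect to the product of two upper half-planes, taken at {z1, z2}.
    It is the product of the two one-variable harmonic measures."""
    return half_plane_measure(z1, *interval1) * half_plane_measure(z2, *interval2)


if __name__ == "__main__":
    # The interval (-1, 1) subtends a right angle at z = i.
    print(half_plane_measure(1j, -1.0, 1.0))
    print(doubly_harmonic_measure(1j, 1j, (-1.0, 1.0), (-1.0, 1.0)))
```

For sets that are not products, such as the parabolic region used in Theorem 3, no such factorization is available and the measure has to be found by other means.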


3. Convergence in Bicylinders. Using the notion of functions of
extended class, Bergmann has established for domains with distinguished
surface a generalization of a theorem given by Ostrowski for the one variable
case (B₂, p. 344). It will be stated here in the restricted case of product
domains with the aid of doubly harmonic measure.²


² Note that in Bergmann’s statement a summation sign is omitted.




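THEOREM 1. Let $f(z_1, z_2)$ be an a.f.2c.v. defined and regular in a
product domain $\mathfrak{A}$ and continuous on the boundary $\mathfrak{a}^3$ of $\mathfrak{A}$. Let the
distinguished surface $\mathfrak{F}^2$ of $\mathfrak{A}$ be composed of m disjunct pieces,
$\mathfrak{F}^2 = \sum_{k=1}^{m} \mathfrak{F}_k^2$, and let $\omega(z_1, z_2; \mathfrak{F}_k^2, \mathfrak{A})$ be the doubly harmonic
measure of $\mathfrak{F}_k^2$. If there exist m constants $M_k$ (k = 1, ···, m) such that
$|f(z_1, z_2)| \le M_k$ for $\{z_1, z_2\} \in \mathfrak{F}_k^2$, then one has in $\mathfrak{A}$ the inequality:

(3)  $\log |f(z_1, z_2)| \le \sum_{k=1}^{m} (\log M_k)\,\omega(z_1, z_2; \mathfrak{F}_k^2, \mathfrak{A}).$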

For the case m = 2 this theorem becomes a generalization of the so-called 
“two-constant ” theorem (N, p. 41): 

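THEOREM 2. Let $f(z_1, z_2)$ and $\mathfrak{A}$ satisfy the hypotheses of Theorem 1. If
$|f(z_1, z_2)| \le M$ on $\mathfrak{F}^2$ and $|f(z_1, z_2)| \le m$ (m < M) on a subset $\mathfrak{S}^2$ of
$\mathfrak{F}^2$, then

(4)  $\log |f(z_1, z_2)| \le p \log m + (1 - p) \log M$

at all points of the set $E[\,\{z_1, z_2\} \in \mathfrak{A},\ \omega(z_1, z_2; \mathfrak{S}^2, \mathfrak{A}) > p,\ 1 > p > 0\,]$.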
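
Inequality (4) is equivalent to the multiplicative bound $|f| \le m^p M^{1-p}$. The following minimal Python sketch evaluates that bound; the function name is hypothetical, and the value of p (a lower bound for the doubly harmonic measure at the point in question) is assumed to be known in advance.

```python
import math


def two_constant_bound(m: float, M: float, p: float) -> float:
    """Upper bound for |f| given by inequality (4):
    log|f| <= p*log(m) + (1 - p)*log(M), i.e. |f| <= m**p * M**(1 - p)."""
    if not (0 < m < M) or not (0 < p < 1):
        raise ValueError("need 0 < m < M and 0 < p < 1")
    return math.exp(p * math.log(m) + (1 - p) * math.log(M))


if __name__ == "__main__":
    # With M = 1 the bound reduces to m**p, the form used in Theorem 3.
    print(two_constant_bound(0.01, 1.0, 0.5))
```

With M = 1 and m = ε the bound becomes $\varepsilon^p$, which is the form in which the theorem is applied in what follows.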


With the aid of Th. 2, the first of the desired convergence theorems can 


be established. 


THEOREM 3. Given $f(z_1, z_2)$ an a.f.2c.v. defined and regular in the
closed quarter-space $y_1 \ge 0$, $y_2 \ge 0$ (topological product of two upper
half-planes), with $|f(x_1, x_2)| \le 1$ on the distinguished surface $y_1 = y_2 = 0$. If
$|f(x_1, x_2)| \le \varepsilon$ (0 < ε < 1) for $\{x_1, x_2\} \in \mathfrak{S}^2 = E[\,y_1 = y_2 = 0,\ \delta(x_1 + a) + x_2^2 < 0\,]$,
where δ and a are arbitrary positive constants, then $|f(z_1, z_2)| \le \varepsilon^p$ for

(5)  $\{z_1, z_2\} \in E[\,\arg(\delta(z_1 + a) + z_2^2) > p\pi,\ 1 > p > 0\,].$

Proof. Apply Theorem 2 with m = ε, M = 1. The function

$\dfrac{1}{\pi}\arg(\delta(z_1 + a) + z_2^2) = \dfrac{1}{\pi}\arctan\dfrac{\delta y_1 + 2x_2 y_2}{\delta(x_1 + a) + x_2^2 - y_2^2}$

is a doubly harmonic (in fact, biharmonic) function which is 1 on

$E[\,y_1 = y_2 = 0,\ \delta(x_1 + a) + x_2^2 < 0\,]$

and 0 on the remainder of $\mathfrak{F}^2$ and hence is the doubly harmonic measure of $\mathfrak{S}^2$.

Two remarks can be made concerning this result. First, the proof can
obviously be extended without change to the case where $z_2^2$ and $x_2^2$ are replaced
by $z_2^{2n}$ and $x_2^{2n}$ respectively. Second, in the limiting case where δ is allowed
to approach infinity, the parabola $\delta(x_1 + a) + x_2^2 = 0$ becomes the line
$x_1 = -a$, and the theorem reduces to the ordinary one variable result, the
convergence being supposed to depend only on the variable $z_1$.

Put in another form, Theorem 3 says that if a bounded function
converges to zero for approach to infinity in the real plane in such fashion that
to every ε there corresponds a δ and an a so that $|f(x_1, x_2)| \le \varepsilon$ for every
$\{x_1, x_2\}$ belonging to the set $\mathfrak{S}^2$ of the theorem, then $|f(z_1, z_2)| \le \varepsilon^p$ (p a fixed
positive quantity less than 1) throughout the four dimensional domain (5),
depending only on the parameters δ and a. Thus, if $\lim_{a \to \infty} \varepsilon = 0$, convergence
to zero uniformly in $\mathfrak{S}^2$ implies convergence to zero in the domain (5) also.
The pair of linear transformations 


(6) a(1 — (z, +1). 


.1+ 2 
map the quarter-space /mf,=0, Imé,=0 on the bicylinder | 2, | Sr, 
|z,|<1. The first transformation takes the points (%,—a,0) in the 
¢,-plane into the points (7, in the z-plane, so that the segment 
(— «©,—«a) goes into the are (r, re*?") ; here (a) is any suitable function 
of a for which ¢(a) > 0 asa— ax. The second transformation takes (0,1, ~ ) 
in the &-plane into (—1,—1,1) in the z-plane. Since such a mapping of 


the quarter-space on a bi-cylinder leaves the doubly harmonic measure in- 


variant (the Poisson integral is invariant under linear transformations), it 
can be applied to the domain employed in Theorem 3 to give a convergence 
theorem for a bicylinder. The first transformation takes Re(€,-+ a) into 


2a[ (1 + cos d)| 2: |? — sin 
| 1 + | 


| i , so that Theorem 3 becomes: 
Ze | 


and the second takes Pe é into 


THEOREM 3a. Given $f(z_1, z_2)$ an a.f.2c.v. defined and regular in the
closed bicylinder $|z_1| \le r$, $|z_2| \le 1$, with $|f(z_1, z_2)| \le 1$ on the distinguished
surface $|z_1| = r$, $|z_2| = 1$. If $|f(z_1, z_2)| \le \varepsilon$ (0 < ε < 1) for


2a8[ (1 + cos 2, |? — sin 
(7) (21,20) (8| (1 + co : | | 
| 1+ a)|2 |* 
Aye" | | | 


where δ and a are arbitrary positive constants, then $|f(z_1, z_2)| \le \varepsilon^p$ for


208 (z, — ) 1 +- \? 
{ 2) arg OSp=1], 
1— > pr, VS p> | 




where φ(a) (0 < φ < π/2) is any suitable function of a which tends to zero
as a tends to infinity.


4. The $\mathfrak{M}$-domain and its properties. An $\mathfrak{M}$-domain is a four
dimensional domain defined by

(9)  $\mathfrak{M} = E[\,z_1 = t\,h(z_2, \lambda),\ |z_2| < 1,\ 0 \le t < 1,\ 0 \le \lambda \le 2\pi,\ h(z_2, 0) = h(z_2, 2\pi)\,].$

Its distinguished surface is $E[\,z_1 = h(z_2, \lambda),\ |z_2| = 1\,]$. The function $h(z_2, \lambda)$
is subject to the following conditions:

(a) $h(z_2, \lambda)$ is an analytic function of $z_2$;

(b) $h(z_2, \lambda)$ is a continuous function of λ whose derivative with respect to λ
exists and is finite;

(c) each curve $z_1 = h(z_2^0, \lambda)$ ($z_2^0$ = const., 0 ≤ λ ≤ 2π) has a positive radius
of curvature at all points, the limit inferior of all radii of curvature being
positive, and is such that any sufficiently small arc $a^1$ lies entirely within a
circle whose circumference cuts the curve only at the end-points of $a^1$, whose
radius is not less than the distance between these end-points, and whose center
lies in the interior of the curve.³


It is possible to extend to $\mathfrak{M}$-domains a theorem on analytic continuation
needed in what follows; the result was first established by Hartogs⁴ in the
case of product domains.


LEMMA 1. Given an $\mathfrak{M}$-domain defined by (9), assume that $\mathfrak{M}$ contains
a product domain $\mathfrak{K}^2 \times E[\,|z_2| < 1\,]$ ($\mathfrak{K}^2$ simply connected). Let $f(z_1, z_2)$
be a function satisfying the following conditions:

(a) $f(z_1, z_2)$ is an a.f.2c.v. in the interior of the product domain; and if
$z_1^0$ is any given interior point of $\mathfrak{K}^2$, then $f(z_1^0, z_2)$ is a continuous function
of $z_2$ on the circumference $|z_2| = 1$;

(b) if $|t_2| = 1$, then $f(z_1, t_2)$ is an analytic function of $z_1$ for $z_1 = t\,h(t_2, \lambda)$
(0 ≤ t < 1, 0 ≤ λ ≤ 2π) and continuous on the boundary $z_1 = h(t_2, \lambda)$;

(c) $f(h(t_2, \lambda), t_2)$ is continuous on the distinguished surface of $\mathfrak{M}$; i.e.,
continuous in the variables λ and $t_2$ ($|t_2| = 1$).

Then $f(z_1, z_2)$ can be continued analytically throughout $\mathfrak{M}$.


³ Hypothesis (c) can also be phrased in terms of conditions as to the boundedness
of the first and second derivatives of $h(z_2^0, \lambda)$ with respect to λ, in a similar manner
to that used in a recent paper by Bergmann and Marcinkiewicz, Fundamenta
Mathematicae, vol. 33 (1939), pp. 75-94; in particular, Lemma 3, p. 80.

⁴ A statement of Hartogs’ theorem will be found in Osgood, Lehrbuch der
Funktionentheorie, vol. II, part 1, p. 199.

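Proof. For each interior point of the product domain, the Cauchy integral
formula for 1 c.v. applied to the variable $z_2$ gives

$f(z_1, z_2) = \dfrac{1}{2\pi i} \int_{|t_2| = 1} \dfrac{f(z_1, t_2)}{t_2 - z_2}\,dt_2.$

Moreover, for each $t_2$ the function $f(z_1, t_2)$ can by a second application
of the Cauchy formula be written as

$f(z_1, t_2) = \dfrac{1}{2\pi i} \int_0^{2\pi} \dfrac{f(h(t_2, \lambda), t_2)}{h(t_2, \lambda) - z_1}\,\dfrac{\partial h(t_2, \lambda)}{\partial \lambda}\,d\lambda.$

Combining these two one has for all points of the product domain the integral
representation

$f(z_1, z_2) = -\dfrac{1}{4\pi^2} \int_{|t_2| = 1} \int_0^{2\pi} \dfrac{f(h(t_2, \lambda), t_2)}{(h(t_2, \lambda) - z_1)(t_2 - z_2)}\,\dfrac{\partial h(t_2, \lambda)}{\partial \lambda}\,d\lambda\,dt_2.$

But this last expression is the generalized Cauchy integral for the domain $\mathfrak{M}$,
as given by Bergmann (B₅), and thus represents an a.f.2c.v. defined
throughout $\mathfrak{M}$. Since the integral agrees with the original function in the
product domain, it represents the analytic continuation of $f(z_1, z_2)$ over $\mathfrak{M}$.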


5. Convergence in $\mathfrak{M}$-domains. Because the theory of two complex
variables possesses no analogue to the Riemann mapping theorem it is not
possible to pass directly from results for the bicylinder to statements
concerning $\mathfrak{M}$-domains. An indirect method of surmounting this difficulty is to
make use of a domain of comparison (B,), in this case of a small bicylinder
contained in the $\mathfrak{M}$-domain, and to show that certain hypotheses as to
convergence on the distinguished surface of the $\mathfrak{M}$-domain imply conditions as to
convergence on the bicylinder to which Theorem 3a is applicable. It is known
by Lemma 1 that any a.f.2c.v., analytic in the bicylinder, which satisfies
certain hypotheses on the distinguished surface of the $\mathfrak{M}$-domain can be
continued analytically throughout the $\mathfrak{M}$-domain. The first step is to insure the
existence of a bicylinder contained in $\mathfrak{M}$, by the introduction of suitable
normal coordinates (B,). Such a system of coordinates is given by the
transformations

(11) = +7, z*., == Zp, 

Oh (22, 2) 




These take all points $z_1 = h(z_2, \lambda_0)$ into $z_1^* = r$ and all the inner normals
to the curves $h(z_2^0, \lambda)$, taken at the point $\lambda_0$, into the direction of the negative
real axis. Thus by the use of these coordinates it may without loss of
generality be assumed that for a particular value λ = const., say λ = 0, the value
of $h(z_2, 0)$ is independent of $z_2$ and that the inner normal to the curve $h(z_2^0, \lambda)$
($z_2^0$ = const.) has at the point λ = 0 a direction independent of the value
of $z_2^0$. Since by hypothesis (c) on the two dimensional sections $z_2$ = const.
of the $\mathfrak{M}$-domain the boundaries of these sections all have radii of curvature
greater than or equal to some positive number r, it may now be supposed that
the $\mathfrak{M}$-domain contains a bicylinder $|z_1| \le r$, $|z_2| < 1$ which is tangent to
the boundary of $\mathfrak{M}$ along the two dimensional surface $z_1 = h(z_2, 0)$.

As a further preliminary, it is necessary to state some results for the one
variable case. Let there be given a two dimensional simply-connected domain
$\mathfrak{G}^2$ whose boundary $\mathfrak{g}^1$ satisfies the conditions imposed on the boundaries of
the sections $z_2$ = const. of the $\mathfrak{M}$-domain. Then it is possible, given a
sufficiently small arc $a^1$ of $\mathfrak{g}^1$, to describe a circle $\mathfrak{K}^2$ with center in $\mathfrak{G}^2$ whose
circumference cuts $\mathfrak{g}^1$ only at the end-points of $a^1$ and which has a radius not
less than the distance between these end-points; so that if $b^1$ denotes the arc of
the circumference of $\mathfrak{K}^2$ which is subtended by $a^1$ and lies outside $\mathfrak{G}^2$, then the
central angle subtended by $b^1$ is not greater than π/3. By Carleman’s extension
principle (N, p. 63), the following inequalities are valid for all
$z \in \mathfrak{G}^2 \cap \mathfrak{K}^2$:

$\omega(z, a^1, \mathfrak{G}^2) \ge \omega(z, a^1, \mathfrak{G}^2 \cap \mathfrak{K}^2) \ge \omega(z, b^1, \mathfrak{K}^2);$

so that the set $E[\,\omega(z, b^1, \mathfrak{K}^2) > p\,]$ is contained in the set $E[\,\omega(z, a^1, \mathfrak{G}^2) > p\,]$.
(Since there is little chance of confusion, ω is here used, as usual, to denote
the harmonic measure specified by its argument.) But the equipotential
$\omega(z, b^1, \mathfrak{K}^2) = p$ is known (N, p. 7) to be the circular arc interior to $\mathfrak{K}^2$ whose
end-points are the same as those of $b^1$ and which makes an angle (1 − p)π
with $b^1$. In particular, by reason of the above hypotheses, a semi-circle whose
end-points coincide with the common end-points of $a^1$ and $b^1$ makes an angle
not greater than 2π/3 with $b^1$, so that for this case p ≥ 1/3. Applying the
one variable form of the two-constant theorem, we thus have the following
result:


LEMMA 2. Given f(z) defined and regular in a domain $\mathfrak{G}^2$ whose
boundary $\mathfrak{g}^1$ has at all points a positive radius of curvature and is such that
about any sufficiently small arc $a^1$ of $\mathfrak{g}^1$ it is possible to describe a circle whose
center is in $\mathfrak{G}^2$, whose circumference cuts $\mathfrak{g}^1$ only at the end-points of $a^1$, and
which has a radius not less than the distance between these end-points. Then
if $|f(z)| \le 1$ on $\mathfrak{g}^1$ and $|f(z)| \le \varepsilon$ (0 < ε < 1) on $a^1$, one has $|f(z)| \le \varepsilon^{1/3}$
at all points of the domain bounded by $a^1$ and the semi-circle whose end-points
coincide with those of $a^1$.


Using this last result, it now becomes possible to pass from a majorant
on the distinguished surface of an $\mathfrak{M}$-domain to a majorant on the
distinguished surface of the bicylinder to be used as a domain of comparison,
and thus to use the result already obtained on convergence in the bicylinder
to obtain a result on convergence in the $\mathfrak{M}$-domain. In what follows, the
normal coordinates introduced in equations (11) with $\lambda_0 = 0$ will be used
without further explicit mention of the fact.

Still considering the one variable case, let $g^1$ be any curve $z_1 = h(z_2^0, \lambda)$
and let $z_1 = h(z_2^0, 0)$ be one end-point of the arc $a^1$. Let a circle $C^2$ of fixed
radius $r$, where $r$ is the lower bound of the radii of curvature of $g^1$ (positive,
by hypothesis), be drawn tangent to $g^1$ at $z_1 = h(z_2^0, 0)$ and lying in $G^2$. Then
the semi-circle $S^1$ whose end-points coincide with those of $a^1$ cuts off an arc
on the circumference $c^1$ of $C^2$ whose length is certainly greater than half the
length of $a^1$, provided the distance $d$ between the end-points of $a^1$ is not greater
than $r$. For let $e^1$ be the arc cut off on $c^1$, and let $a$ and $e$ be the lengths of
$a^1$ and $e^1$ respectively. Obviously, the most unfavorable case is for $d = r$ and
for $a^1$ coinciding with the tangent to $g^1$ at $z_1 = h(z_2^0, 0)$. In this case
dSa< and = rd/4, so that aS 4e/3 and a fortiori a < 2¢. 

To summarize these results: By the hypotheses on the $\mathfrak{M}$-domain and by
the use of the appropriate normal coordinates it is possible to assume without
loss of generality that M contains a bicylinder | z, |S 7, | (ra fixed 
positive quantity) to which it is tangent along the two dimensional surface 
=h(22,0) =r, | z |S 1. Moreover, to any are a! = (h(22,A),7) (A> 0) 
on a section z, const., there corresponds an are (re%,7r) (6 > 0) cut off on 

z,|==1 by a semi-circle whose end-points coincide with those of a‘, the 


length of this latter arc being greater than half the length of $a^1$. Thus the set


(12) h(22,A)—r |?(1 + Reh (22, A*)— 27 (Imh (22, A) ) (Imh (22, A*) ) | 
HL 


| 1+ h(a, A*)|? | h(z2,A) |? 


|4 


= 1, A* =const., A* => | h(22,A*) —r| > 2r| 


corresponds in this manner to a set on the distinguished surface of the
bicylinder which includes the set (7) of Theorem 3a. Hence, using the results
of Theorem 3a, it is now possible to state the following result for the case
of an $\mathfrak{M}$-domain:




THEOREM 4. Given f(z, 22) defined and regular in an Mt-domain given 
by equation (9) and satisfying the hypotheses (a), (b), and (c) following 
equation (9). If | f(2,22)| <1 on the distinguished surface z, = h(z2.X), 
la, |—=1 of Mand | f(a, 2)| <« for {2,22} belonging to the set X? given 
by (12), then | f(z, 2)| <&/* for {2,22} belonging to the set (8) of 
Theorem 3a. 


This theorem has an interpretation for convergence wholly analogous to 
that previously given for Theorem 3. 


6. Extensions and applications. These results can be extended in at 
least two directions. First, the hypotheses on the set 3° of Theorem 4 can 
undoubtedly be made sharper if necessary, and the hypothesis that $f(z_1, z_2)$
be regular in the entire $\mathfrak{M}$-domain can be lightened in accordance with Lemma
1 to the supposition that the function is merely defined, bounded, and continuous
on the distinguished surface of $\mathfrak{M}$ and regular in the intersection of $\mathfrak{M}$
and a product domain which includes the particular two dimensional surface
$z_1 = h(z_2, 0)$, $|z_2| \le 1$ where convergence is to be studied. Second, reverting
to Theorem 3, it is possible to use n overlapping parabolas and to suppose, for 
example, that | f(2,22)| <1 on the real plane y; = y2=0 and | f(%, 22) 
< me in the overlapping portions of the parabolas. The doubly harmonic 
measure of the resulting region is obtained merely by addition of the doubly 
harmonic measures of the individual parabolas. 

One of the interesting applications of the theory of functions of two 
complex variables is to the theory of pseudo-conformal mapping; i. e., mapping 
of a four dimensional domain by a pair of analytic functions of two complex 
variables. The results here stated apply only to a single function, but can 
easily be applied, in connection with some results of Bergmann (B;), to the 
study of convergence to the boundary of the pseudo-conformal map of a 
domain. I hope to discuss these matters further at a later time. 


UNIVERSITY OF CALIFORNIA. 
BERKELEY, CALIFORNIA. 




ON THE NON-EXISTENCE OF THE EUCLIDEAN ALGORITHM 
IN CERTAIN QUADRATIC NUMBER FIELDS.* 


By ALFRED BRAUER. 


Introduction. Let $P$ be the field of rational numbers, and $m$ be a rational
integer which is not divisible by the square of a prime. If for any pair of
integers $\alpha$, $\beta$ with $\beta \ne 0$ of $P(m^{1/2})$ a third integer $\gamma$ of this field can be
determined such that

(1) $|N(\alpha - \beta\gamma)| < |N(\beta)|$,

where $N(\alpha)$ is the norm of $\alpha$, we say¹ that the Euclidean algorithm exists in
the field $P(m^{1/2})$ or that the field is Euclidean.
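Condition (1) can be made concrete with a small sketch. It assumes the illustrative choice $m = 2$ (one of the values for which the algorithm is stated below to exist), represents $x + y\sqrt{2}$ as a pair `(x, y)`, and picks $\gamma$ by rounding the two coordinates of $\alpha/\beta$; the function names are hypothetical and the rounding rule is one possible choice, not a method taken from the paper.

```python
from fractions import Fraction

M = 2  # assumption: the field P(sqrt(2)), whose integers are x + y*sqrt(2)


def norm(x, y):
    """Norm of x + y*sqrt(M)."""
    return x * x - M * y * y


def divide(alpha, beta):
    """Return (gamma, remainder) with alpha = beta*gamma + remainder.

    alpha, beta, gamma and the remainder are pairs (x, y) standing for
    x + y*sqrt(M); gamma is found by rounding the coordinates of alpha/beta.
    """
    a0, a1 = alpha
    b0, b1 = beta
    n = norm(b0, b1)
    # alpha/beta = alpha * conjugate(beta) / N(beta) = r + s*sqrt(M)
    r = Fraction(a0 * b0 - M * a1 * b1, n)
    s = Fraction(a1 * b0 - a0 * b1, n)
    g0, g1 = round(r), round(s)
    # remainder = alpha - beta*gamma
    rem = (a0 - (b0 * g0 + M * b1 * g1), a1 - (b0 * g1 + b1 * g0))
    return (g0, g1), rem


gamma, rem = divide((7, 3), (2, 1))
print(gamma, rem, abs(norm(*rem)) < abs(norm(2, 1)))
```

Rounding leaves both coordinates of $\alpha/\beta - \gamma$ at most $1/2$ in absolute value, so the norm of the difference is at most $1/2$ in absolute value, which is why the check succeeds for $m = 2$. For $m \equiv 1 \pmod 4$ the integers also include half-integral coordinates, which this sketch does not handle.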

The problem of determining all Euclidean quadratic fields has not been 
solved completely, although this question has been studied a great deal during 
the last few years. In this paper I prove that the algorithm does not exist 
in certain cases in which the question has remained unsolved till now. 

If a field is Euclidean, then the greatest common divisor exists for any
pair of integers of this field; thus it is necessary that the class number is
equal to 1. Dedekind² remarked that this condition is not sufficient, because
the class number is equal to 1 in the field $P(\sqrt{-19})$, although this field is
not Euclidean.

For imaginary quadratic fields it was shown by L. E. Dickson³ that the
Euclidean algorithm exists only in the cases $m = -1, -2, -3, -7$, and
$-11$. For real quadratic fields the question has not yet been solved
completely. For
(2) $m = 2, 3, 5, 6, 7, 11, 13, 17, 19, 21, 29, 33, 37, 41, 57$,

the algorithm exists. This follows from the investigations of O. Perron,⁴
A. Oppenheim,⁵ R. Remak,⁶ E. Berg,⁷ and N. Hofreiter.⁸ I. Schur⁹ remarked

* Received October 13, 1939.

1 Cf. G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers,
Oxford (1938), pp. 212-217.

2 P. G. Lejeune Dirichlet, Vorlesungen über Zahlentheorie, herausgegeben von R.
Dedekind, 4. Aufl., Braunschweig (1894), p. 451.

3 Algebren und ihre Zahlentheorie, Zürich u. Leipzig (1927), pp. 150-151.

4 "Quadratische Zahlkörper mit Euklidischem Algorithmus," Mathematische
Annalen, vol. 107 (1932), pp. 489-495.

5 "Quadratic fields with and without Euclid's algorithm," Mathematische Annalen,
vol. 109 (1934), pp. 349-352.

6 "Über den Euklidischen Algorithmus in reell-quadratischen Zahlkörpern,"
Jahresbericht d. Deutschen Mathematiker-Vereinigung, vol. 44 (1934), pp. 238-250.


that the algorithm does not exist for $m = 47$. A. Oppenheim¹⁰ proved the
non-existence for $m = 23$ and $m = 53$, N. Hofreiter¹¹ for $m \equiv 14 \pmod{24}$,
and also¹² for $m = 77$ and for $m \equiv 21 \pmod{24}$ with $m > 21$, E. Berg¹³ and
J. Fox Keston¹³ᵃ for $m \not\equiv 1 \pmod 4$ except in the cases (2). Some of these
results are also proved in Hardy and Wright's book mentioned above in
footnote 1. H. Behrbohm and L. Rédei¹⁴ showed that, excepting the cases
(2), the algorithm can only exist in the following three cases ($p$ and $q$ denote
primes):
I. $m = p \equiv 13 \pmod{24}$,
II. $m = p \equiv 1 \pmod 8$,
III. $m = pq$ with $p \equiv q \equiv 3 \pmod 8$ or $p \equiv q \equiv 7 \pmod 8$.


Using analytical methods, P. Erdös and Ch. Ko¹⁵ proved that the algorithm
does not exist in the cases I and II, if $m$ is sufficiently large. The
corresponding fact in the case III was shown by H. Heilbronn.¹⁶ Finally, L.
Schuster¹⁷ proved that in the case III, the algorithm exists at most for
$m \equiv 1 \pmod{24}$ except for $m = 33$ and $m = 57$.

In this paper I improve the theorem of Erdös and Ko for the case I.
I show by elementary methods that here the algorithm cannot exist for
$p > 109$. In the cases $p = 13$ and $p = 37$ the algorithm exists; whether or
not the fields $P(\sqrt{61})$ and $P(\sqrt{109})$ are Euclidean, I cannot decide.

In their paper mentioned above, Erdös and Ko prove that the algorithm
does not exist in the case I, if the two least quadratic non-residues $u$ and $v$
which are odd primes satisfy the condition


7 "Über die Existenz eines Euklidischen Algorithmus in quadratischen
Zahlkörpern," Kungl. Fysiografiska Sällskapets i Lund Förhandlingar, vol. 5 (1935), Nr. 5.

8 "Quadratische Körper mit und ohne Euklidischen Algorithmus," Monatshefte für
Mathematik und Physik, vol. 42 (1935), pp. 397-400.

9 Cf. loc. cit. 5), p. 351.

10 Loc. cit. 5).

11 "Quadratische Zahlkörper ohne Euklidischen Algorithmus," Mathematische
Annalen, vol. 110 (1935), pp. 195-196.

12 Loc. cit. 8).

13 Loc. cit. 7).

13a "Existence of a Euclidean algorithm in quadratic fields," Thesis, Yale University
(1935); cf. Bulletin of the American Mathematical Society, vol. 41 (1935), p. 186.

14 "Der Euklidische Algorithmus in quadratischen Zahlkörpern," Journal f. d.
reine u. angewandte Mathematik, vol. 174 (1936), pp. 192-205.

15 "Note on the Euclidean algorithm," Journal of the London Mathematical
Society, vol. 13 (1938), pp. 3-8.

16 "On Euclid's algorithm in real quadratic fields," Proceedings of the Cambridge
Philosophical Society, vol. 34 (1938), pp. 521-526.

17 "Reellquadratische Zahlkörper ohne Euklidischen Algorithmus," Monatshefte f.
Mathematik u. Physik, vol. 47 (1938), pp. 117-127.




(3) $3uv < p$.


Then they use analytical methods for proving that (3) is satisfied for all
primes which are sufficiently large. I here prove (3) for all primes $p > 421$
of the form $24n + 13$ in an elementary manner, using the inequality for the
least odd quadratic non-residue $u$ modulo a prime $p$ of the form $8n + 5$,
(4) $u < 2(4p)^{2/5} + 2(4p)^{1/5} + 1$,
which I had obtained by elementary methods in a former paper.¹⁸

After dealing briefly with the method of Erdös and Ko in § 1, I prove
in § 2 that the algorithm does not exist in $P(p^{1/2})$, if $p$ is of the form $24n + 13$
and $v > 8u$. If, however, $v < 8u$, then the non-existence of the algorithm
for sufficiently large $p$ follows immediately from (4). The limit for $p$ for
which this holds can easily be given. In order to replace it by a smaller
number, I show in § 3 that the Euclidean algorithm does not exist, if
$p = 24n + 13 > 12696$ and $v > 6u$. From these theorems, the non-existence
of the algorithm follows for the primes of this type, if $24u^2 < p$, or if
$18u^2 < p$ and $p > 12696$.

In § 4, I improve (4) to 

U < 2°/5p2/5 4 2-6/5). 4. 3 for p= 8n + 5, 
< + 4% + for p= 8n + 3. 

For $p = 8n + 5$ this is still further improved. On the basis of these theorems
we obtain the result in § 5 that the algorithm does not exist in $P(p^{1/2})$ for
$p = 24n + 13 > 3300000$. The primes below this limit must be treated
directly. Using the above theorems we can see that the algorithm does not 
exist for p > 109. This direct treatment requires long computation, even if 
properly arranged. In these computations I have been assisted by my mother 
and my wife. 

1. The method of Erdös and Ko. Erdös and Ko prove the following
theorems:


THEOREM 1. For a prime $p$ of the form $4n + 1$, the Euclidean
algorithm cannot exist in $P(p^{1/2})$, if $p$ can be written in the form


(5) $p = q_1 m_1 + q_2 m_2$,


where $m_1, m_2, q_1, q_2$ are all positive and quadratic non-residues (mod $p$), and
where the $q_i$ are odd primes which divide $q_i m_i$ to an odd power for $i = 1, 2$.


Proof. We write the condition (1) in the form 


18 "Über den kleinsten quadratischen Nichtrest," Mathematische Zeitschrift, vol. 33
(1930), pp. 161-176.


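(6) $|N(\alpha/\beta - \gamma)| < 1$.
Suppose now
(7) $\alpha/\beta = r + s\,p^{1/2}$, $\gamma = \tfrac{1}{2}(x + y\,p^{1/2})$,


where $r, s$ are rational and $x, y$ rational integers with $x \equiv y \pmod 2$. This
is possible, since $\gamma$ is any integer of the field $P(p^{1/2})$ and $p \equiv 1 \pmod 4$. From
(6) and (7) we obtain

(8) $|(x - 2r)^2 - p(y - 2s)^2| < 4$.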
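The passage from (6) and (7) to (8) is a one-line norm computation. The following worked step is a sketch that writes $\alpha/\beta = r + s\,p^{1/2}$ and $\gamma = \tfrac12(x + y\,p^{1/2})$, with $r, s$ rational and $x, y$ rational integers, as in (7).

```latex
\begin{align*}
\frac{\alpha}{\beta} - \gamma
  &= \Bigl(r - \frac{x}{2}\Bigr) + \Bigl(s - \frac{y}{2}\Bigr) p^{1/2}, \\
N\Bigl(\frac{\alpha}{\beta} - \gamma\Bigr)
  &= \Bigl(r - \frac{x}{2}\Bigr)^{2} - p\,\Bigl(s - \frac{y}{2}\Bigr)^{2}
   = \frac{(x - 2r)^{2} - p\,(y - 2s)^{2}}{4}.
\end{align*}
% Hence |N(\alpha/\beta - \gamma)| < 1 holds exactly when
% |(x - 2r)^2 - p (y - 2s)^2| < 4, which is condition (8).
```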

Consequently, if for a pair of rational numbers $r$ and $s$ it is impossible
to determine the rational integers $x$ and $y$ such that the condition (8) is
satisfied, the field is not Euclidean. Since $q_1 m_1$ is a quadratic residue, the
congruence $z^2 \equiv q_1 m_1 \pmod p$ has a solution $z_1$.

We now choose

$r = 0$, $s = z_1/p$,


and it follows from (8) that
$|p x^2 - (py - 2z_1)^2| < 4p$.


Since here the left-hand side is congruent to $-4 q_1 m_1 \pmod{4p}$, we have either


(9) $p x^2 - (py - 2z_1)^2 = -4 q_1 m_1$,
or, by (5),
(10) $p x^2 - (py - 2z_1)^2 = 4p - 4 q_1 m_1 = 4 q_2 m_2$.


We have to show that (9) is not possible. Suppose first that $x \equiv 0 \pmod{q_1}$.
Then also $py - 2z_1 \equiv 0 \pmod{q_1}$ and $q_1$ divides the left-hand side of (9)
to an even power, but the right-hand side to an odd power. This is
impossible. Suppose now $x \not\equiv 0 \pmod{q_1}$. Then it follows from (9) that $p x^2$ is a
quadratic residue of $q_1$. This is impossible, because
$\left(\frac{p x^2}{q_1}\right) = \left(\frac{p}{q_1}\right) = \left(\frac{q_1}{p}\right) = -1$.

Thus (9) is impossible. In the same way it follows that (10) cannot be 
solvable. 
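The hypotheses of Theorem 1 are finitely checkable for a given prime, so a brute-force search can illustrate them. The sketch below is not part of the paper: the function names are hypothetical, quadratic character is decided by Euler's criterion, and the search is only practical for small $p$.

```python
def is_prime(n):
    """Trial division; adequate for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def is_nonresidue(a, p):
    """Euler's criterion: a is a quadratic non-residue modulo the odd prime p."""
    return pow(a, (p - 1) // 2, p) == p - 1


def divides_to_odd_power(q, n):
    """True when the exact power of q dividing n has an odd exponent."""
    e = 0
    while n % q == 0:
        n //= q
        e += 1
    return e % 2 == 1


def theorem1_representations(p):
    """All (q1, m1, q2, m2) with p = q1*m1 + q2*m2 meeting the hypotheses of Theorem 1."""
    found = []
    for q1 in range(3, p, 2):
        if not (is_prime(q1) and is_nonresidue(q1, p)):
            continue
        for m1 in range(1, (p - 1) // q1 + 1):
            if not (is_nonresidue(m1, p) and divides_to_odd_power(q1, q1 * m1)):
                continue
            rest = p - q1 * m1  # must equal q2*m2
            for q2 in range(3, rest + 1, 2):
                if rest % q2 or not (is_prime(q2) and is_nonresidue(q2, p)):
                    continue
                m2 = rest // q2
                if is_nonresidue(m2, p) and divides_to_odd_power(q2, rest):
                    found.append((q1, m1, q2, m2))
    return found


print(theorem1_representations(13))                   # no representation for p = 13
print((5, 7, 61, 2) in theorem1_representations(157))  # 157 = 5*7 + 61*2
```

For $p = 13$, where the algorithm is stated in the introduction to exist, the search finds nothing, as Theorem 1 requires; for $p = 157$ it finds, among others, $157 = 5 \cdot 7 + 61 \cdot 2$.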

THEOREM 2. Let $p$ be a prime of the form $24n + 13$. If $u$ and $v$ are
the two least quadratic non-residues (mod $p$), which are odd primes, and if
(11) $3uv < p$,
then the Euclidean algorithm does not exist in $P(p^{1/2})$.


Proof. If we set
(12) $p = 3uv + 2b_1$,
and
(13) $p = uv + 2b_2$,




then $b_1$ and $b_2$ are positive integers, because of (11). For primes $p = 24n + 13$,
the number 2 is a quadratic non-residue and 3 a quadratic residue. Hence,
from (12) it follows that
$\left(\frac{2b_1}{p}\right) = \left(\frac{-3uv}{p}\right) = \left(\frac{uv}{p}\right) = 1$.
Consequently, $b_1$ is a non-residue. In an analogous manner it follows from
(13) that $b_2$ is a non-residue. Further, from (12) and (13), we have
$p = 3b_2 - b_1$.

Therefore, one of the two numbers $b_1$, $b_2$ is odd. Let us first assume that $b_1$
is odd. Since $b_1$ was a quadratic non-residue, there exists at least one odd
prime $q$ which is a non-residue (mod $p$) and which divides $b_1$ to an odd power.
Because of (12), the number $q$ is different from $u$ and $v$. We set in Theorem 1

$q_1 = u$, $m_1 = 3v$, $q_2 = q$, $m_2 = 2b_1/q$.
Then (12) yields a representation of $p$ which satisfies the conditions of
Theorem 1. Therefore the algorithm does not exist in $P(p^{1/2})$ in this case.

If, however, $b_2$ is odd, then there exists at least one odd prime $q'$ which
is a quadratic non-residue (mod $p$) and which divides $b_2$ to an odd power.
From Theorem 1 for

$q_1 = u$, $m_1 = v$, $q_2 = q'$, $m_2 = 2b_2/q'$,
and from (13) it then follows that the algorithm does not exist in $P(p^{1/2})$ in
this case.
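The proof of Theorem 2 is constructive, and its steps can be traced numerically. The sketch below follows the two cases of the proof ($b_1$ odd, else $b_2$ odd); the function names are hypothetical, and choosing the smallest admissible prime $q$ is an arbitrary convention of this illustration.

```python
def is_prime(n):
    """Trial division; adequate for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def is_nonresidue(a, p):
    """Euler's criterion for the odd prime p."""
    return pow(a, (p - 1) // 2, p) == p - 1


def two_least_odd_prime_nonresidues(p):
    """The numbers u < v of the text: the two least odd primes that are non-residues mod p."""
    found = []
    q = 3
    while len(found) < 2 and q < p:
        if is_prime(q) and is_nonresidue(q, p):
            found.append(q)
        q += 2
    return tuple(found)


def theorem2_witness(p):
    """Return (q1, m1, q2, m2) as built in the proof of Theorem 2, or None if 3uv >= p."""
    u, v = two_least_odd_prime_nonresidues(p)
    if 3 * u * v >= p:
        return None
    b1 = (p - 3 * u * v) // 2   # from (12)
    b2 = (p - u * v) // 2       # from (13)
    b, m1 = (b1, 3 * v) if b1 % 2 == 1 else (b2, v)
    for q in range(3, b + 1, 2):
        if b % q or not (is_prime(q) and is_nonresidue(q, p)):
            continue
        e, n = 0, b
        while n % q == 0:       # exponent of q in b
            n //= q
            e += 1
        if e % 2 == 1:
            return (u, m1, q, 2 * b // q)
    return None


print(two_least_odd_prime_nonresidues(157), theorem2_witness(157))
```

For $p = 157 = 24 \cdot 6 + 13$ one has $u = 5$, $v = 7$, $3uv = 105 < 157$, $b_1 = 26$ and $b_2 = 61$, so the second case of the proof applies with $q' = 61$.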

The Theorems 1 and 2 will be used in the following. The proofs are
given here again, in the first place in order to show that they are elementary,
in the second because the Theorem 2 is not given explicitly in the paper of
Erdös and Ko. There it is assumed instead of (11) that the three least
quadratic non-residues u,v, w (mod p) which are odd primes satisfy the 
condition 

uvw < pr 
where 7 <-001 is a positive constant. But for the primes of the form 
24n + 13, Erdés and Ko actually use only the weaker assumption (11). In 
my paper it will be important that the condition (11) is sufficient in this case.

2. Elementary proof of the theorem for large p. We first prove the 
following theorem : 

THEOREM 3. Let $p$ be a prime of the form $24n + 13$. If the two least
quadratic non-residues $u, v$ (mod $p$), which are odd primes, satisfy the
condition


(14) $v > 8u$,


then there does not exist a Euclidean algorithm in $P(p^{1/2})$.




Proof. We have nothing to prove for $p = 13$, 37, and 61, since (14)
is not true for these primes. We assume therefore $p \ge 109$.

The least odd quadratic non-residue $u$ modulo a prime $p$ of the form
$8n + 5$ satisfies the condition


(15) $u < (p + 4)^{1/2} + 2$,

as I have shown in the paper mentioned above.¹⁹ Hence, because of $p > 96$
(16) $p - 8u > p - 8(p + 4)^{1/2} - 16 > 0$,

since


$(p - 16)^2 = p^2 - 32p + 256 > 64p + 256$.
Let $2ku$ be the largest multiple of $2u$ which is less than $p$. The following
eight even numbers


(17) $2(k-3)u,\ 2(k-2)u,\ 2(k-1)u,\ 2ku,\ 2(k+1)u,$
$2(k+2)u,\ 2(k+3)u,\ 2(k+4)u$
lie in the interval $\{p - 8u \cdots p + 8u\}$ and are therefore positive because
of (16). They form an arithmetical progression with the difference $2u$. It
follows that exactly two of these numbers are divisible by 4 and not by 8;
the difference of these two numbers is $8u$. This implies that at least one of
them, say $4lu$, is not divisible by $u^2$. Then


(18) $(l, u) = 1$.

Furthermore, since $4lu$ is one of the numbers (17), we have
(19) $p - 8u < 4lu < p + 8u$,

(20) $|p - 4lu| < 8u$.


On the other hand, $|p - 4lu|$ is an odd integer less than $8u$ because
of (20), hence less than $v$ because of (14). It follows that $|p - 4lu|$ is a
quadratic residue (mod $p$), since $|p - 4lu|$ is not divisible by $u$, and all the
odd positive integers less than $v$, which are not divisible by $u$, are quadratic
residues. Then $lu$ also is a quadratic residue, and therefore $l$ a quadratic
non-residue (mod $p$). Because of (18), $l$ contains at least one odd prime $w$
which is different from $u$ and a quadratic non-residue (mod $p$). Consequently,
$v \le w$, hence because of (19)

$4uv \le 4uw \le 4lu < p + 8u$,

$3uv + uv < p + 8u$,

$3uv + u(v - 8) < p$.
Thus, because of (14)

$3uv < p$.


19 Loc. cit. 18), Satz 2. 




Theorem 2 now shows that the Euclidean algorithm cannot exist in $P(p^{1/2})$.
This proves Theorem 3.
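The counting step in the proof of Theorem 3 (exactly two of the eight numbers (17) are divisible by 4 and not by 8, and they differ by $8u$) can be checked mechanically. The sketch below is an illustration only: the function name is hypothetical, and the sample values $p = 109$, $u = 11$ are chosen for the demonstration without claiming that (14) holds for them.

```python
def theorem3_selection(p, u):
    """Return the eight numbers (17), those divisible by 4 but not by 8,
    and the candidates for 4lu (not divisible by u*u)."""
    k = (p - 1) // (2 * u)  # 2ku is the largest multiple of 2u below the odd number p
    multiples = [2 * (k + j) * u for j in range(-3, 5)]
    special = [n for n in multiples if n % 8 == 4]
    candidates = [n for n in special if n % (u * u) != 0]
    return multiples, special, candidates


multiples, special, candidates = theorem3_selection(109, 11)
print(multiples)
print(special, special[1] - special[0] == 8 * 11)
print(candidates)
```

Since $u$ is odd, $2ju \equiv 4 \pmod 8$ exactly when $j \equiv 2 \pmod 4$, which happens for two of any eight consecutive values of $j$; this is the reason behind the count used in the proof.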

If, on the other hand, the assumption (14) is not satisfied, i. e., if $v < 8u$,
then (4) implies the inequality


(21) $3uv < 24u^2 < 24\{2(4p)^{2/5} + 2(4p)^{1/5} + 1\}^2$.
However, if $p$ is sufficiently large, we have

(22) $24\{2(4p)^{2/5} + 2(4p)^{1/5} + 1\}^2 < p$.

For all values of $p$ for which (22) holds we have because of (21)


$3uv < p$.
According to Theorem 2 the Euclidean algorithm does not exist in $P(p^{1/2})$ for
these $p$ in the case $v < 8u$ we are considering. In connection with Theorem 3
this yields the theorem of Erdös and Ko.


THEOREM 4. If $p$ is a sufficiently large prime of the form $24n + 13$,
then there does not exist a Euclidean algorithm in $P(p^{1/2})$.

Since (4) had been obtained by elementary methods, we have given a
proof which is free of analytical methods. More exactly, we see that the
algorithm does not exist, when (22) is satisfied. We may easily obtain a lower
bound for $p$ from which (22) holds. We do not give the computation, since
we shall later obtain a still smaller value of this lower bound.

As an immediate consequence of Theorem 3 we have


THEOREM 5. If the least odd quadratic non-residue $u$ modulo a prime
$p$ of the form $24n + 13$ satisfies the condition $24u^2 < p$, then there does not
exist a Euclidean algorithm in $P(p^{1/2})$.

Proof. Let again $u$ and $v$ be the least quadratic non-residues (mod $p$)
which are odd primes, $u < v$. If $v > 8u$, the statement follows from Theorem 3.
If, however, $v < 8u$, then

$3uv < 24u^2 < p$


and the theorem follows from Theorem 2.

3. Improvement of Theorem 3. The Theorem 3 can be improved in
the following manner:


THEOREM 6. If $p > 12696$ is a prime of the form $24n + 13$, and if
the two least quadratic non-residues $u$ and $v$ (mod $p$) which are odd primes
satisfy the condition
(23) $v > 6u$,


then there does not exist a Euclidean algorithm in $P(p^{1/2})$.




Proof. If $u \le 23$, then $24u^2 \le 12696$, and the statement follows from
Theorem 5. We assume therefore that

(24) $u \ge 29$.

As in the proof of Theorem 3, let $2ku$ be the greatest integral multiple of $2u$
less than $p$. We take here the following four integers

(25) $2(k-1)u,\ 2ku,\ 2(k+1)u,\ 2(k+2)u$

which belong to (17). These integers lie in the interval $\{p - 4u \cdots p + 4u\}$.
They are all positive because of (16), since $p > 12696 > 96$. There is exactly
one of them which is divisible by 4, but not by 8. Suppose that this is
the number $4lu$. Analogously to (19) and (20), we obtain from (25)


(26) $p - 4u < 4lu < p + 4u$,
(27) $|p - 4lu| < 4u < 6u$.


Further, $l$ is odd. If we have
(28) $(l, u) = 1$


in accordance with (18), then the statement follows from (23), (26), (27),
and (28) in complete analogy with the proof of Theorem 3.
On the other hand, let us suppose that


$(l, u) > 1$.
Then we have

$(l, u) = u$,
since $u$ is a prime; hence
(29) $4lu \equiv 0 \pmod{u^2}$.


From (15) it follows that
(30) $p - 24u > p - 24(p + 4)^{1/2} - 48 > 0$
since $p > 672$, and therefore
$(p - 48)^2 = p^2 - 96p + 2304 > 576p + 2304$.
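The two thresholds $p > 96$ in (16) and $p > 672$ in (30) come from the same squaring step. The sketch below checks them with exact integer arithmetic; the function name and the parameter `c` (8 for (16), 24 for (30)) are conventions of this illustration.

```python
def bound_is_positive(p, c):
    """True when p - c*sqrt(p + 4) - 2*c > 0, decided without floating point.

    For p > 2c the inequality is equivalent to (p - 2c)^2 > c^2 * (p + 4).
    """
    lhs = p - 2 * c
    return lhs > 0 and lhs * lhs > c * c * (p + 4)


print(bound_is_positive(96, 8), bound_is_positive(97, 8))      # threshold of (16)
print(bound_is_positive(672, 24), bound_is_positive(673, 24))  # threshold of (30)
```

Expanding the squares shows why the thresholds are sharp: $(p - 16)^2 - 64(p + 4) = p(p - 96)$ and $(p - 48)^2 - 576(p + 4) = p(p - 672)$.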
We consider the interval
(31) $I = \{p - 24u \cdots p\}$.
There lie exactly 4 odd multiples of $3u$ in $I$. Let
(32) $3su,\ 3(s+2)u,\ 3(s+4)u,\ 3(s+6)u$
be these multiples. The even integers
(33) $p - 3(s+6)u,\ p - 3(s+4)u,\ p - 3(s+2)u,\ p - 3su$


form an arithmetical progression with the difference $6u$. It follows that




exactly one of the numbers (33) is divisible by 4 and not by 8. Suppose
that this is $p - 3tu$; then we have
(34) $\tfrac{1}{4}(p - 3tu) \equiv 1 \pmod 2$.


The integer $3tu$ belongs to the numbers (32), hence we have


(35) $t \equiv 1 \pmod 2$
and
(36) $0 < p - 24u < 3tu < p$


according to (31) and (30).

Moreover $3tu$ and $4lu$ both lie in the interval $\{p - 24u \cdots p + 4u\}$
because of (26) and (36). Their difference then is at most equal to $28u$.
Because of (24), we have


(37) $|3tu - 4lu| \le 28u < u^2$.


According to (35), $3tu$ is odd, and therefore different from $4lu$. It follows
from (37) that $3tu$ and $4lu$ are not both divisible by $u^2$. Hence we obtain


(38) $3tu \not\equiv 0 \pmod{u^2}$
by (29). Furthermore, we have
(39) $0 < \tfrac{1}{4}(p - 3tu) < 6u < v$


because of (36) and (23).

The integer $\tfrac{1}{4}(p - 3tu)$ is odd, not divisible by $u$, and less than $v$ because
of (34) and (39). Consequently, it is a quadratic residue (mod $p$); so is $3tu$.
It follows that $3t$ and $t$ are quadratic non-residues. But $t$ was odd because
of (35), positive because of (36), and not divisible by $u$, according to (38).
Then $t$ contains at least one odd prime factor $w$ which is a quadratic
non-residue (mod $p$) but different from $u$. For it, we have $v \le w$, and therefore,
because of (36)

$3uv \le 3uw \le 3ut < p$.


The statement of Theorem 6 follows now from Theorem 2.
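The choice of the multiple $3tu$ in the proof above is a purely arithmetical selection, which the following sketch reproduces. The function name is hypothetical, and the sample values $p = 157$, $u = 5$ do not satisfy the hypotheses of Theorem 6 ($p > 12696$, $v > 6u$); they are used only to show the selection among the four odd multiples of $3u$ in the interval (31).

```python
def theorem6_selection(p, u):
    """Return (t, quarter): 3tu is the odd multiple of 3u in (p - 24u, p) for which
    p - 3tu is divisible by 4 but not by 8, and quarter = (p - 3tu) / 4.

    Assumes p is odd, p > 24u, and 3u does not divide p.
    """
    step = 3 * u
    # odd multiples of 3u below p: 3u, 9u, 15u, ...
    odd_multiples = [n for n in range(step, p, 2 * step) if n > p - 24 * u]
    chosen = [n for n in odd_multiples if (p - n) % 8 == 4]
    if len(odd_multiples) != 4 or len(chosen) != 1:
        raise ValueError("assumptions on p and u are not met")
    n = chosen[0]
    return n // step, (p - n) // 4


print(theorem6_selection(157, 5))
```

For these sample values the four odd multiples are 45, 75, 105, 135; the differences from 157 are 112, 82, 52, 22, of which only 52 is divisible by 4 and not by 8, so $3tu = 105$ and $t = 7$.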
From Theorem 6 we obtain at once the following theorem which improves
Theorem 4:


THEOREM 7. If $p > 12696$ is a prime of the form $24n + 13$ and if the
least quadratic non-residue $u$ (mod $p$), which is an odd prime, satisfies the
condition $18u^2 < p$, then there does not exist a Euclidean algorithm in $P(p^{1/2})$.
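The sufficient conditions collected so far (Theorems 2, 3, 5, 6 and 7) depend only on $p$, $u$ and $v$, so they can be tabulated for a given prime. The sketch below is an illustration with hypothetical names; a value `True` means that the hypothesis of the named theorem holds, and `False` only means that this particular theorem gives no information.

```python
def is_prime(n):
    """Trial division; adequate for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def two_least_odd_prime_nonresidues(p):
    """u < v: the two least odd primes that are quadratic non-residues mod p."""
    found = []
    q = 3
    while len(found) < 2 and q < p:
        if is_prime(q) and pow(q, (p - 1) // 2, p) == p - 1:  # Euler's criterion
            found.append(q)
        q += 2
    return tuple(found)


def criteria(p):
    """Evaluate the hypotheses of Theorems 2, 3, 5, 6, 7 for a prime p = 24n + 13."""
    if p % 24 != 13 or not is_prime(p):
        raise ValueError("p must be a prime of the form 24n + 13")
    u, v = two_least_odd_prime_nonresidues(p)
    return {
        "u": u,
        "v": v,
        "theorem_2": 3 * u * v < p,
        "theorem_3": v > 8 * u,
        "theorem_5": 24 * u * u < p,
        "theorem_6": p > 12696 and v > 6 * u,
        "theorem_7": p > 12696 and 18 * u * u < p,
    }


print(criteria(13))
print(criteria(157))
```

For $p = 13$ none of the hypotheses holds, in agreement with the statement in the introduction that the algorithm exists there; for $p = 157$ the hypothesis (11) of Theorem 2 holds with $u = 5$, $v = 7$.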


4. Estimates for the least odd quadratic non-residue. In my paper
mentioned in the introduction, I have shown that the least odd quadratic
non-residue $u$ for a prime of the form $8n + 5$ satisfies the inequality


(40) $u < 2\{(4p)^{2/5} + (4p)^{1/5}\} + 1$.


It was mentioned there that this relation can still be improved for primes of 
the form 8n + 5. In this manner, we may obtain 


(41) 2{ (2p)? + +1. 
We now have to improve (40) and (41) still further. 


THEOREM 8. The least odd quadratic non-residue u modulo a prime p 
satisfies 


(42) § < 29/5p?/5 4 2-6/5) 25/5 3 for p= 8n + 5, 


2p + 4% for p=8n+3. 


Proof. We have nothing to prove for $u = 3$ and $u = 5$; hence we assume
$u \ge 7$. The even numbers


are quadratic residues. For a p of the form 8n + 5, the numbers 


are also quadratic residues. Let $U$ denote the interval $\{p \cdots p + u - 1\}$
if $p$ is of the form $8n + 3$, and the interval $\{p - u + 1 \cdots p + u - 1\}$
for $p$ of the form $8n + 5$. Then all even integers of $U$ are quadratic residues.
If $z$ is an arbitrary odd integer such that
(43) $1 \le z < u/2$ for $p = 8n + 3$,

$1 < z < u/2$ for $p = 8n + 5$,


then $U$ contains integral multiples of $2z$. Let
(44) $k \cdot 2z,\ (k+1)2z,\ \cdots,\ (k+l-1)2z$


be those multiples of $2z$; then
(45) $k \le [p/2z] + 1$.


All the numbers (44) are quadratic residues as even numbers of $U$, 2 is a
non-residue, and $z$ a quadratic residue because of (43). This implies that


the numbers
(46) $k,\ k+1,\ \cdots,\ k+l-1$


form a sequence of $l$ non-residues. None of them is therefore a square. Hence
we may find a positive integer a such that 


(47) (a+1)? 




Then it follows from (45) that 
(48) 


We divide now the interval $A = \{a^2 \cdots (a+1)^2\}$ into parts; we have
to distinguish between two cases.

I. $a$ even:
We determine the positive integer $t'$ such that
(49) $(a+1)^2 - (2t')^2 > a^2 > (a+1)^2 - (2t'+2)^2$.


This can always be done since an even square number cannot equal the
difference of an odd and an even square number and since $a \ge 2$. Then


$(2t')^2 < 2a + 1$,
(50) $t' \le \tfrac{1}{2}[\sqrt{2a}]$.


The points
(51) $(a+1)^2 - (2\nu)^2$ $(\nu = 1, 2, \cdots, t')$


divide the interval $A$ into subintervals. The distance between two such
consecutive points is


(52) $(a+1)^2 - (2\nu)^2 - [(a+1)^2 - (2\nu+2)^2] = 8\nu + 4$.
The largest of the subintervals of $A$ is therefore either the interval
$I_{t'} = \{(a+1)^2 - (2t'-2)^2 \cdots (a+1)^2 - (2t')^2\}$,
or the interval
$I_{t'+1} = \{(a+1)^2 - (2t')^2 \cdots a^2\}$.
Because of (52), we find for the length $|I_{t'}|$ of $I_{t'}$
(53) $|I_{t'}| = 8t' - 4$.
Since $a$ was even, we have
$a^2 - [(a+1)^2 - (2t'+2)^2] \equiv 3 \pmod 4$.
Because of (49), we find then
$a^2 - [(a+1)^2 - (2t'+2)^2] \ge 3$.
For the length $|I_{t'+1}|$ of $I_{t'+1}$ we have therefore
(54) $|I_{t'+1}| \le 8t' + 4 - 3 = 8t' + 1$


because of (52). If $s'$ now denotes the maximal length of a subinterval of $A$,
we obtain from (53), (54), and (50)



(55) $s' \le 8t' + 1 \le 4[\sqrt{2a}] + 1$.
II. $a$ odd:
In this case we determine an integer $t'' \ge 0$ such that


(56) $(a+1)^2 - (2t''+1)^2 > a^2 > (a+1)^2 - (2t''+3)^2$.


This again is possible since the sum of two odd squares cannot be a square.


Then

(57) $t'' \le \tfrac{1}{2}([\sqrt{2a}] - 1)$.

The points

(58) $(a+1)^2 - (2\nu+1)^2$ $(\nu = 0, 1, 2, \cdots, t'')$


again divide the interval $A$ into parts, and the distance of two consecutive
points (58) is given by
(59) $(a+1)^2 - (2\nu+1)^2 - ((a+1)^2 - (2\nu+3)^2) = 8\nu + 8$.
Consequently, if $t'' > 0$ then either the interval
$I_{t''} = \{(a+1)^2 - (2t''+1)^2 \cdots (a+1)^2 - (2t''-1)^2\}$
or the interval
$I_{t''+1} = \{a^2 \cdots (a+1)^2 - (2t''+1)^2\}$


is the largest of the subintervals of $A$. If $t'' = 0$ then $I_{t''+1}$ is the largest
subinterval of $A$. If again $|I_{t''}|$ and $|I_{t''+1}|$ denote the lengths of $I_{t''}$ and
$I_{t''+1}$, then because of (59)
(60) $|I_{t''}| = 8t''$.

Since $a$ was odd, we have

$a^2 - ((a+1)^2 - (2t''+3)^2) \equiv 2 \pmod 4$;
hence because of (56)
$a^2 - ((a+1)^2 - (2t''+3)^2) \ge 2$

and because of (59)
(61) $|I_{t''+1}| \le 8t'' + 8 - 2 = 8t'' + 6$.
Let $s''$ denote the maximal length of the subintervals of $A$. From (60), (61),
and (57), it follows that
(62) $s'' \le 8t'' + 6 \le 4[\sqrt{2a}] + 2$.

We set now
(63) $s = \operatorname{Max}(s', s'')$.
In both cases I and II, the length of each subinterval of $A$ is at most equal to $s$
and we have, because of (55) and (62)




(64) $s \le 4[\sqrt{2a}] + 2$.
If we had now
(65) $u \ge \operatorname{Max}\{a + 2 + [\sqrt{2a}],\ \epsilon z s\}$,
where
(66) $\epsilon = 1$ for $p = 8n + 5$,
$\epsilon = 2$ for $p = 8n + 3$,


then all the odd integers $\le a + 1 + [\sqrt{2a}]$ would be quadratic residues.
In the case that $a$ is even, the odd integers


(67) $(a+1)^2 - (2\nu)^2 = (a + 1 + 2\nu)(a + 1 - 2\nu)$


for $\nu = 1, 2, \cdots, t'$ would all be quadratic residues because of (50). Similarly,
for odd $a$, the odd numbers


(68) $(a+1)^2 - (2\nu+1)^2 = (a + 2\nu + 2)(a - 2\nu)$
for $\nu = 0, 1, \cdots, t''$ would all be quadratic residues because of (57).

In both cases, we have divided the interval $A$ by the points (51) and (58)
respectively which are quadratic residues according to (67) and (68). The
length of each subinterval of $A$ was smaller than or equal to $s$ because of (63).
It follows that each interval of length $s$ which lies in $A$ contains a quadratic
residue, and hence $A$ cannot contain a sequence of $s$ non-residues.
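The bound (64) on the subinterval lengths can be checked by direct enumeration. The sketch below builds the division points of $A = \{a^2 \cdots (a+1)^2\}$ used in the two cases ($(a+1)^2 - m^2$ with $m$ even when $a$ is even, $m$ odd when $a$ is odd) and measures the largest gap; the function names are hypothetical, and `isqrt(2 * a)` plays the role of $[\sqrt{2a}]$.

```python
from math import isqrt


def subdivision_points(a):
    """Points (a+1)^2 - m^2 lying strictly above a^2, with m of the same parity as a."""
    m = 2 if a % 2 == 0 else 1
    points = []
    while m * m < 2 * a + 1:  # equivalent to (a+1)^2 - m^2 > a^2
        points.append((a + 1) ** 2 - m * m)
        m += 2
    return points


def max_gap(a):
    """Largest distance between consecutive marks a^2, division points, (a+1)^2."""
    marks = sorted([a * a, (a + 1) ** 2] + subdivision_points(a))
    return max(y - x for x, y in zip(marks, marks[1:]))


for a in (2, 9, 50):
    print(a, subdivision_points(a), max_gap(a), 4 * isqrt(2 * a) + 2)
```

Every division point is odd and factors as a product of two odd integers not exceeding $a + 1 + [\sqrt{2a}]$, which is the property used in (67) and (68).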

On the other hand, we have because of (65)


(69) $u \ge \epsilon z s$.

In the case of a prime $p$ of the form $8n + 3$, the interval $U$ contained $u$
consecutive integers, and then, according to (69) and (66), at least $s$ complete
residue systems mod $2z$ and hence at least $s$ multiples of $2z$. For $p = 8n + 5$,
there appear at least $s$ multiples of $z$ among the $u$ numbers $p, p+1, \cdots,$
$p + u - 1$ because of (69) and (66). The same is true for the $u$ numbers
$p, p - 1, \cdots, p - u + 1$. But here, we have $z > 1$, according to (43),
and this shows that $z$ does not divide $p$. Thus, we have also in
$U = \{p - u + 1 \cdots p + u - 1\}$ at least $2s$ consecutive multiples of $z$,
hence $s$ multiples of $2z$.

On the other hand, the number of multiples of $2z$ in $U$ was equal to $l$
because of (44), hence $l \ge s$.

It follows from (46) and (47) that the interval $A$ contains a sequence
$k, k+1, \cdots, k+l-1$ of at least $s$ non-residues. This gives a contradiction
which shows that (65) is not true. Hence
$u < \operatorname{Max}\{a + 2 + [\sqrt{2a}],\ \epsilon z s\}$

and then according to (64) 




u< Max {a+ [V2a], V2a] + 2)}, 
u S Max {a+ 1+ [V2a], V2a] + 2) —1}. 


Because of (48), we have then 


u< Max} (2) +2 (4) +1,4(4 2 + 2 


% % 
(70) u< Max} (2) + (22) +1, + 


For p of the form 8n + 5, we assume first that p > 2048. For p of the 
form 8n + 3, no restriction is necessary. 
If we assume that 


(71) U 23/5 62/5 2/5 29/5 6/5 (3 1 


and if we determine the odd integer zo so that 


1 
+2>0>2(4 
then it follows from (71) and (72) that Ju > 2. For p=—8n-+ 5, we have 


z > 1 because of (66), since p > 2048. For z= the conditions (43) 
are satisfied. Then it follows from (70) and (72) that 


~(1/5) 4 -(1/5) 
u< Max 2«p (4) + + 1, 
4 \1/5 3 1/5 
de +2) +3(#) + 4e—1 


3/5 
(73) u< Max 29/5 92/5 + 24/5_1/51/5 + 1 (4) 


326° 


3 pe 2/5 6 pe 1/5 ) 


We state now that 


(2767/5) (8/5) 98/20 +. 3- 
This can be shown as follows. The first two terms of both sides are equal. 
Furthermore 
(4/5) 1/5 <3: 24/5_-(4/5) 1/5 ¥, am Q- (11/5) (4/5) 91/5 
and 
16 < 81 Perr”. 


Consequently (74) holds. From (73) and (74), we obtain 




u < Max {23/5 7/5 92/5 + 24/5 
dep (2-(7/5) (8/5) 93/20 +3: Q- (1/5) 79-(1/20) ) 4. Q- (6/5) 91/5 1}, 
u< Max {23/5 2/5 24/5 1/5 
23/5 2/5 2/5 + 99/5 . + 16/8) 41/6 + 1}. 
Hence 
U < 23/5 9/5 + 29/5 6/5 (3 + + 
This gives a contradiction to (71), showing that (71) can not hold. Hence, 
according to (66) 
U < 27/9 4. 2-6/5) «25+ + 3 for p= 8n+ 5 > 2048, 
U< + 4% pe + 7 for p= 8n + 3. 
This proves the statement (42) for all primes of the form $8n + 3$ and for
primes of the form $8n + 5$ which exceed 2048.
It remains to prove (42) for primes of the form $8n + 5$ which are less
than 2048. For these primes, we have, according to (15)
$u < (p + 4)^{1/2} + 2$.
It is therefore sufficient to show that for $p < 2048$ we have
{75) V peat 2 < 23/5 4 2-6/5) 2Q5yV/5 + 3. 
But this follows from 
p+ (23/5 92/5 + Q-(6/5) . 29/5 4/5 + 92/5. 25 + Q-(12/5) . 625 


which is true, since p < 2048 and therefore p < 25p*/*. Hence (75) holds 
and Theorem 8 is proved. 
It will be necessary to improve Theorem 8 still further. We have 
THEOREM 9, Let p be a prime of the form 8n + 5 which satisfies 
(76) p > 10°, 
Assume that the two least quadratic non-residues u,v which are odd primes 
satisfy the condition 


(77) u<u< bu. 

If we set 

(78) p=v/u +1, 

we have 

(79) u < 2(p/p)*/* + (24/p + 4) (p/p)? + 9/p. 
Proof. From (77) and (78) we obtain 

(80) 7. 


For u < 16, the statement (79) is certainly true because of (76). For u > 16, 
we have 




u— (2p—2) >4 


according to (80). Let z be an arbitrary odd integer which satisfies 


(81) 2p—2S22%< 4. 

We consider the numbers of the form p+ hvu with integral hy = 0 which 
lie in the interval {p—v+1,---,p+v—1}. We have here hu <1, 
and according to (78) 

(82) 1. 


At most one of the numbers p+ hvu is divisible by 2z. Indeed, if two of 
these numbers were divisible by 2z, so would their difference, which is of the 
form h*u with | h* | < 2p—2, because of (82). But w is a prime and 
therefore prime to 2z, according to (81). Hence h* would be divisible by 2z. 
This, however, is impossible because (81) implies 

| h* | < 2p—2 5 2z. 


Consequently, either all the numbers p + hvu in the interval {p—v-+1---p}, 
or all those numbers in the interval {p--- + v—1} are not divisible by 2z. 
Let us assume that {p---p-—+uv-—1} does not contain a number p + hvu 
which is divisible by 2z. In the other case we may argue in exactly the same 
manner. 

Let V be the interval {p − u + 1, ..., p + v − 1}, and suppose that 
(83) k·2z, (k + 1)·2z, ..., (k + l − 1)·2z 
are the multiples of 2z in V. Since it has been assumed that the interval 
{p, ..., p + v − 1} does not contain a multiple of 2z of the form p ± h_ν u, 
and since the interval {p − u + 1, ..., p − 1} does not contain a number of 
the form p ± h_ν u, none of the numbers (83) is of the form p ± h_ν u with 
integral h_ν. However, all the even numbers of V, except the numbers 
p ± h_ν u, are quadratic residues (mod p). Since 2 is a non-residue and z is a 
residue, according to (81), the numbers 
(84) k, k + 1, ..., k + l − 1 
form a sequence of l quadratic non-residues, analogously to (46). From (83) 
it follows that (45) again holds. 


Let a, A, and s have the same significance as in the proof of Theorem 6. 


We see in the same manner as above, that (48) and (64) hold. Then 


(85) 


(86) s<4[V20]+2. 
If now 


+1 | 
(87) u = Max at2+ 2a], Zs 


then we could show as above that A cannot contain a sequence of s non-residues. 




NON-EXISTENCE OF EUCLIDEAN ALGORITHM IN QUADRATIC NUMBER FIELDS. 713 
Because of (83), V contains exactly l integral multiples of 2z. On the 
other hand, V contains exactly u + v − 1 consecutive integers, and therefore 
at least [(u + v − 1)/2z] multiples of 2z. Consequently, we have 


[~—]= 


because of (78) and (87), and there lies a sequence k,k + 1,---,k+1—1 

of at least s non-residues in A, according to (84). This gives a contradiction 

to the result above, and hence (87) is not true. We have, therefore 
u<Max}a+24 

Since the first expression on the right-hand side is an integer, we see, because 

of (86) and (85), that 


Max} al, 


< Max} { V2 1, 


p 
( fo, “2p 8 , 4e+1 
(88) tax} VE + 1,2 + 
Let us assume that 


(89) u=2 (2)" + (=+ (2)" 
p \p p 


We then determine the odd integer z) such that 


? 


22(4[ V2a] +2) +1 
p 


Because of (80), we have p< From (76) it follows that 


911. 55 915. 55 2% [p—1]° 
91 = = 
(«—1)5 


since is increasing with fora = 1. From (90) and (91) we obtain 


(92) [p] > p—l, 


since Zz) was an integer 
Furthermore, we have to show that 


(93) <u. 


But this is true, since 


2 




1/5 1/5 1/5 
20 < +4<2(2) +1<2(2) 
4° \p p p 


2/5 9 1/5 ( 
+42 <2(2) +t) (2) 
pp p p 2/\p p 


because of (90), (80), (76), and (89). 


According to (92) and (93) the conditions (81) are satisfied for z = z,. 


Hence it follows from (88) and (90) that 


8 
p 


We now state that 


12/5 3/5 3 8/5 2/5 4/5 1/5 12/5 73/5 3 8/5 2/5 

This follows immediately from 3 < 27/8 and 16 < 2%, since the first two 

terms on either side of the inequality are equal. We obtain from (94) and (95) 


u < Max 2 (2) "+ 2 (2)""4 1, 
p p 


8 3/5 ) 9 ) 
3/53/20 41. 3 /-(1/5) (1/20) ) P 


Since ρ < 7, each term of the second expression on the right side is not 
smaller than the corresponding term of the first expression. Hence 


p p \p p 


This gives a contradiction to (89) which cannot then be true, proving (79). 


5. Proof of the non-existence of the algorithm for p > 3 300 000. 
Based on the results of the preceding section we shall now show that there 
does not exist a Euclidean algorithm in P(√p), if p is a prime of the form 
24n + 13, and p > 3 300 000. 
According to Theorem 6, it is sufficient to assume v < 6u, and hence 
2 < ρ < 7 because of (80). Theorem 9 gives then the inequality (79) for u. 
According to Theorem 2, the algorithm certainly does not exist, if 

(96) 3uv = 3(ρ − 1)u² < p 
or because of (79) 


— mi/4 12/5 3/5 8/5 92/5 +- 3 4/5 + 16 i 


4n l6p 
(94) mex | (Sapa) + 
| 
tl 




p p p 


We now state that the function of ρ 


n\2/5 
(98) (0) 
p 


is increasing in the interval 2< p= 7, if p> 3300000 is fixed. Since 
Y% (p/p)*/"(p —1)?”? is increasing for p > 2, it is sufficient to show that 


2/5 Z 1/5 ) 


is increasing for 2 < ρ ≤ 7, if p > 3 300 000 is fixed. Differentiating with 
regard to ρ for a fixed p and setting p^(1/5) = y, we obtain 


(99) (p) = y’p*” (2p + 8) + y(288 — 168p)+ 45p/*(2—p). 
We have to distinguish between two cases 


6. 
Because of p > 3 300000 we have y > 20 and therefore from (99) 


(100) (p) > y(40p + 160 + 288 — 168p) + 45-34 (2—p) 
> y(448 — 128p) + 90(2 —p) > 64y— 90 > 0. 
2) 


From (99) we obtain 


(101) > 3*/*y?(2p + 8) + y(288 — 168p) 
+ 45+ p) > 2.2y?(2p + 8) + y(288 — 168p)— 450 
= y(88p + 352 + 288 — 168p)— 450 > 80y—450 > 0. 
Because of (100) and (101) we have ψ′(ρ) > 0 for 2 < ρ ≤ 7, hence 
ψ(ρ) and φ(ρ) are increasing in this interval, if p > 3 300 000 is fixed. It 
then follows from (98) for 2 < ρ ≤ 7 that 


4/5 2 2/5 
(102) < 72 (2) + 36) (4) 
( ( 


10368 , 1080 , 9\(p\?" (7776 , 162 1458 




According to (96), (97), (98), and (102), the Euclidean algorithm does 
not exist in the field P(√p) if 


4/2 3/5 
Suv = < ) + 

p 

7 


36297 8910 , 1458 


This is true for p > 3 300 000, since the polynomial 


1980 , 36297, 8910 1458 


3 300 =" 


(25 — — 


has only one positive root, and assumes a positive value for -( 7 


as can be seen by a simple computation. 


6. The case p < 3 300 000. The primes p < 3 300 000 must be investigated 
directly using Theorems 5, 7, 1, and 2. This investigation is tedious 
but the methods are straightforward. 

If the integer 5 is a non-residue for such a p, then we cannot have an 
algorithm, according to Theorem 5, if p > 24·5² = 600. Analogously, we 
can argue for p > 24·7², if (7/p) = −1, for p > 24·11², if (11/p) = −1, 
and for p > 24·13², if (13/p) = −1. 

We can then form the 180 arithmetical progressions which contain those 
primes p = 24m + 13, for which 5, 7, 11, and 13 are residues. We consider 
those primes p < 3 300 000 using the moduli 17, 19, 23, ... as far as is 
necessary for the construction of the least quadratic non-residue. For most 
of these p, it follows from Theorems 5, 7 or 2 that we have no Euclidean 
algorithm in P(√p). The only exceptional cases are p = 13, 37, 61, 109, 
181, 229, and 421. For p = 13 and p = 37, the algorithm exists, as has been 
mentioned in the introduction. For p = 181, 229, and 421, there is no 
algorithm, as follows from Theorem 1, because of 

181 = 7·17 + 2·31,  229 = 7·13 + 6·23,  421 = 13·19 + 6·29. 

Whether or not the algorithm exists for p = 61 and p = 109, I cannot decide. 
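
The elementary arithmetic of this section can be checked mechanically. The following is a minimal sketch and not part of the original argument; the helper names are hypothetical. It assumes the law of quadratic reciprocity: since p = 24m + 13 is of the form 4k + 1, an odd prime q is a residue (mod p) exactly when p is a non-zero square (mod q). The sketch counts the admissible residue classes modulo 24·5·7·11·13, verifies the three decompositions quoted above, and tabulates the bounds 24·q².

```python
from math import prod

def is_nonzero_square(r, q):
    """True if r is a non-zero quadratic residue modulo the odd prime q (Euler's criterion)."""
    return r % q != 0 and pow(r, (q - 1) // 2, q) == 1

SMALL_PRIMES = (5, 7, 11, 13)
MODULUS = 24 * prod(SMALL_PRIMES)  # 120120

# Residue classes r (mod 120120) with r = 13 (mod 24) for which
# 5, 7, 11 and 13 are all residues; each class is one arithmetical progression.
progressions = [
    r for r in range(13, MODULUS, 24)
    if all(is_nonzero_square(r, q) for q in SMALL_PRIMES)
]

# The decompositions p = a*b + c*d quoted in the text.
decompositions = {
    181: (7, 17, 2, 31),
    229: (7, 13, 6, 23),
    421: (13, 19, 6, 29),
}

# The bounds 24*q^2 used with Theorem 5.
thresholds = {q: 24 * q * q for q in SMALL_PRIMES}

if __name__ == "__main__":
    print(len(progressions))
    for p, (a, b, c, d) in decompositions.items():
        print(p, a * b + c * d == p)
    print(thresholds)
```

The count agrees with the product (4/2)(6/2)(10/2)(12/2) of the numbers of non-zero squares modulo 5, 7, 11 and 13.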

We have thus the result 

THEOREM 10. There is no Euclidean algorithm in the field P(√p), if 
p > 109 is a prime of the form 24n + 13. 

THEOREM 11. Let p > 421 be a prime of the form 24n + 13, and u, v 
the two least quadratic non-residues (mod p) which are odd primes. Then 
we have 

< p. 


INSTITUTE FOR ADVANCED STUDY, 
PRINCETON, N. J. 




POSTULATIONAL BASES FOR THE UMBRAL CALCULUS.* 
By E. T. BELL. 


As the somewhat condensed treatment of the umbral calculus which I 
gave elsewhere ¹ has been misunderstood,² a fuller treatment than was given 
before is desirable. Incidentally, what follows validates the purely formal uses 
of this calculus, or of its special cases, which have appeared in the literature, 
when such uses give correct results. There are immediate generalizations to 
abstract commutative rings, obtainable by obvious modifications of the following; 
but as such generalizations seem to be of no use at present, it seems 
hardly worth while to develop them. 


1. Rational operations on umbrae. 


(1.1) Real, or complex, numbers are called scalars. The sign ≡ denotes 
either definitional identity or identity as in algebra; which will be clear from 
the context. 

(1.2) Scalars are denoted by small Latin letters with non-negative integer 
suffixes, thus x_N (N = 0, 1, ...), or by small Greek letters, α, β, .... As usual, 
the sum, product of any scalars α, β are α + β, αβ, and 0, 1 have their usual 
meanings. 


(1.3) Latin capitals, A, ..., N, ... denote non-negative integers. 

(1.4) If x_N (N = 0, 1, ...) are any scalars, the one-rowed matrix 
(x_0, x_1, ..., x_N, ...) is denoted by x. 

(1.5) The (N + 1)-th element, N = 0, 1, ..., of x in (1.4) is denoted 
by x^N; thus 

x^N = x_N (N = 0, 1, ...). 

(1.6) The x in (1.4) is called an umbra; x is the umbra of (x_0, x_1, ..., x_N, ...), 
or of the sequence x_N (N = 0, 1, ...). Note that an umbra has 
neither exponent nor suffix. 


(1.7) Equality of umbrae is matric equality: if x is as in (1.4), and 


* Received April 8, 1940. 

¹ “Algebraic arithmetic,” American Mathematical Society Publications, vol. 7 
(1927), pp. 146-159. 

² G. Temple, Journal of the London Mathematical Society, vol. 12 (1937), p. 114. 
Professor Temple has seen the present note, and writes (Feb. 21, 1938) that it clears 
up the obscurity. 




718 E. T. BELL. 

y ≡ (y_0, ..., y_N, ...), x is equal to y, written x = y, if, and only if, 
x_N = y_N (N = 0, 1, ...). Hence 

(1.71) x = x. 

(1.72) If x = y, then y = x. 

(1.73) If x = y and y = z, then x = z. 


(1.8) The coefficient of z,---apS7 in the expansion of (zx, 
by the multinomial theorem, is denoted by Ms,,...,5,. Note that exponents 
and suffixes 0,1 are to be indicated precisely in the same way as exponents 


and suffixes > 1. 


The next refer to rational functions of umbrae, and define ‘ umbral 
scalar multiplication,’ ‘umbral addition, etc. The qualification ‘ umbral’ 
will be dropped, as it is taken care of in the notation. 


(1.9) The scalar product, αx, of α and x ≡ (x_0, ..., x_N, ...) is 

αx ≡ (αx_0, ..., αx_N, ...). 

By definition, xα = αx. 
Now αx is an umbra, by (1.6), and it is a compound symbol. To denote 
the (N + 1)-th element of αx in accordance with (1.5), we write {αx}^N; thus 

(1.91) {αx}^N = αx^N = αx_N. 

Similarly, if * is any compound symbol of scalars and umbrae, and if * 
is an umbra, the (N + 1)-th element of * is denoted by {*}^N. 


(1.10) The sum, s,s = aa +---+-+ é2, of aa,- where 


@ == (do,° °),° °°, T= * 
is 

Hence 
(1. 101) faa = aay +--+ ++ 


(1.102) Addition, +, of umbrae is commutative and associative ; 


(1.103) There is a unique z, the zero umbra, such that x + z = x for 
every x; 

(1.104) For every x there is a unique y such that x + y = z; y is called 
the negative of x; y = (−1)x, and is denoted by −x; 
(1.105) With respect to + the set of all umbrae is an abelian group; the 
inverse of x in the group is −x, and the identity of the group is z. 




(1.11) If no two of a, ..., x are equal as defined in (1.7), a, ..., x 
are said to be distinct. In (1.12)-(1.125), a, ..., x are distinct. 


(1.12) Ifa,---,« are distinct umbrae, (aa +- - --+ denotes the 
scalar py, 

(1.120) py=(aa+---+ 
(see (1.8)). In particular, 

(1. 121) Po = * * 2X. 
Hence, by (1.5), 


the left of which is called the N-th power of the sum aa +-- - - + é. Hence 
such powers are expanded by the multinomial theorem, and + is replaced by 
+ in the result. 

If py is as above defined, and p= (po,: - +, +), then = py, 
by (1.5). Note the distinction, as shown in (1.101), (1.122), between 


only the second of which is a power; both are scalars. 
By (1.121), 
(1. 123) (aa +--+ ++ 


In (1.122) replace N by N+ &. The resulting scalar, 
(aa shi she Ex) N+R, 
is called the product, 


(aa $ (aa $+ 
of 


(aa+-- 


It follows that this multiplication, -, is commutative and associative, and that 
it has the ‘identity’ (aa +- - +--+ éx)°. The right of (1.124) may be (and 
is) calculated from the left by expanding each of the factors ( )¥, ( )*” by 
the multinomial theorem, multiplying the resulting (scalar) polynomials 
together as in common algebra and finally degrading all exponents of small 
Latin letters to suffixes. For example, noting that = B® = 1, and =a, 
B' = B, since a, B are scalars, we have 


(aa + (aa + Bb)? 
= (aa'b® + Barb’) + 2aBa'b' + 




= + 3a87a1b? + 
a 3a°Ba.b, 3aB7a,b2 -+- obs, 
— (aa + 


As a mere convenience of notation we write 


(1.125) + (Bb)® + (ye) 8] 
= (aa)™ + (Bb)P+- +++ (yc)8, 


the (scalar) sum of scalars on the right defining the expression on the left. 
Similarly for an infinity of scalar summands. 

All in this section (1.12) refers only to the case in which the T umbrae 
a, ..., x are distinct. The contrary case is equally important in applications 
of the calculus, and requires special consideration. 

(1.13) If in ae+-- --+ & there are precisely T summands az,: - -, é2, 
each of which is a scalar product of a scalar and x, we replace (—) the T a’s 


_ by T distinct umbrae, say a,- - -,a, in any order, and indicate this replace- 
ment by writing 
(1. 131) t+: 


Then (aa + ----+ é) is to be calculated by (1. 122), and the exponents are 
degraded, as in (1.120). In the result, each of a,---,x is replaced (<-) by z; 
the resulting polynomial is defined to be N-th power (ax + --- + &r)% of the 
sum az -+----+ 


For example, 


(aa + (aa + Bax)’; 
(aa + Bx)* = + 3a°Baor, + 

— + Bra, + 22 + ; 
(ax + Br)*® = + B*) xox, + + 


The relation (1.124) holds also for powers (αx + ··· + ξx)^N, when 
therein the replacements (→), (←) are made. 

Similarly, if ina (+) sum s there are precisely A summands each of which 
is a scalar product of a scalar and x,---, precisely C summands each of which 
is a scalar product of a scalar and w, and if these summands exhaust s, the 
S=A+4+-:-:-+C ws,:-+-,w’s, are replaced (—) by S distinct umbrae, say 
s—t. Then (t)% is calculated by (1.122), (1.120), and the final replace- 
ment (<)' of the § distinct umbrae by those introduced by (—). These 
powers (s)% also satisfy (1. 124). 


(1.132) Hence (1.124) holds for any umbrae a,---, 7, distinct or not. 
(1.14) 2% was defined in (1.5); it denotes the scalar zy. Hence, since 




multiplication of scalars is indicated (as always) by mere juxtaposition, 
without any symbol denoting the operation of multiplication, 


Since this multiplication is multiplication of scalars, it is commutative and 
associative. 

In (aa +-- defined in (1.120), take and each of the 
other scalars —0. Then by (1.9), (1.103), (0a +--+ -+ = (z)%, 
Note that ( ) is not omitted on the right. By (1.120), (7)’ =2%. Hence, 
by (1.5), =ay. By (1.124), (x) = (x)N*®, and hence, by 
what has just been shown, - — — zy,p, 


(1. 142) * = 


Thus, unless = A LR. The ‘ dot multiplication,’ -, is an 
operation peculiar to the calculus, and will be explicitly indicated where there 
is any possibility of confusion. 

Similarly, (αa + ··· + ξx)^N (αa + ··· + ξx)^R, without the dot, is the 
(scalar) product of the scalars (αa + ··· + ξx)^N, (αa + ··· + ξx)^R, which 
are defined in (1.122); and this scalar product is different from the dot 
product in (1.124). To see the difference in an example, we compare the 
example illustrating (1.124) with the following: 


(aa + Bb)*(aa + Bb)?, 

= (a,b + Baob:) + 2aBarb, + Baobz), 

== + Bb ob, (2a,? + + (251° + bobo) + 
(aa + Bb)?» (aa + 


(1.15) A particular case of (1.120) occurs so frequently that a special 
notation is convenient. If s ≡ αx + ··· + αx is a sum of precisely A scalar 
products αx, we write 

(1.151) s ≡ A · αx. 

There can be no confusion between the dot in A · αx and that in (1.124), 
since here the dot is between a scalar and an umbra, while in (1.124) it is 
between two scalars. If desired, the dot in (1.151) may be circled, thus ⊙. 
It would be incorrect to write Aαx instead of A · αx, since Aα is a scalar, and 
hence, by (1.9), Aαx is a scalar product. 


(1.16) Umbral multiplication can be defined in many (actually, an infinity 
of) ways to yield algebras simply isomorphic with parts of the common algebra 
of scalars, for example rings. Here we need mention only that species of 
umbral multiplication which is directly applicable to the power series in § 2. 


It will not be used in the sequel. 




(1. 161) = (X/0!,4,/1!,-- +, 2v/N!,° 
is said to be of e-type (e==‘exponential’). Hence, if y is of e-type, 
yN =yy/N!. If w is not of e-type, it is replaced by in which /N!, 
until after all calculations involving # have been completed, when wy is 
replaced by N! wy. 

Let r= +), y=(y/0!,°° +, be 


of e-type. The product, xy, of x,y (in this order) is the matrix p which is 
such that 


_ ((t+y)° (x+y)? (e+ y)% 
(1. 163) y= (SE, 


Hence umbral multiplication is commutative and associative. Thus 
powers may be defined as usual; the A-th power of x is denoted by x’, to 
distinguish it from #4, 


2. Power series. The set of all (formal) power series in the variable θ 
is closed under the four rational operations. Division is immediately referred 
to multiplication, and need not be separately discussed. Irrational functions 
of these power series also occur, but as they are of less interest than the rational 
functions, and are readily investigated if desired, they will not be considered 
here. The use of formal (disregard of convergence) power series can be justified 
in detail, if not obviously legitimate in the present connection (for example, 
as in my paper, Transactions of the American Mathematical Society, vol. 25, 
1923, 135-54); however, there is sufficient generality in the set of all power 
series in θ convergent in the same domain |θ| > 0 to show here how the 
definitions, etc., in §1 give immediately the algorithms of Blissard’s umbral 
calculus. 
If x ≡ (x_0, ..., x_N, ...) we write 

(2.1) e^{xθ} ≡ Σ_{N=0}^{∞} x_N (θ^N/N!), 

where e has its usual meaning (2.7···). Thus, by (1.5), 

(2.11) e^{xθ} ≡ Σ_{N=0}^{∞} x^N (θ^N/N!). 
By either of these, ge” is a scalar. Hence if A(g,- - +,7) is a polynomial in 


é,- +,» with scalar coefficients, A= A(ée”,- - -, is a scalar, as is also 
the N-th derivative 0,VA of A with respect to 6. By writing A as a MacLaurin 




series In 0, we express it in the form re’, and similarly for the derivative. 
For any A (or its derivative) the appropriate re’ is built up by repeated 
applications of the elementary identities (2.2)-(2.4) in @. 


(2.2) e^{xθ} e^{yθ} = Σ_{N=0}^{∞} (x + y)^N (θ^N/N!), 

which, by (1.120), is merely the formal multiplication of two MacLaurin 
series to produce a third. Generally, for any number of factors on the left, 

(2.21) e^{ξxθ} ··· e^{ηyθ} = Σ_{N=0}^{∞} (ξx + ··· + ηy)^N (θ^N/N!). 

For addition, (1.101) gives 

(2.22) e^{xθ} + e^{yθ} = Σ_{N=0}^{∞} {x + y}^N (θ^N/N!), 

with the obvious extension to any number of summands. 
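
The rule that a power of a sum of umbrae is expanded by the multinomial theorem, with exponents then degraded to suffixes, can be illustrated numerically. The following is a minimal sketch, not taken from the paper: the function names are hypothetical, an umbra is represented by a finite list of its first scalars, and the multinomial expansion is carried out as a repeated binomial expansion, which gives the same result. The sketch checks that the N-th power of a sum of two umbrae is the coefficient of θ^N/N! in the product of the two exponential series, and that the dot product of powers differs from their scalar product.

```python
from fractions import Fraction
from math import comb, factorial

def umbral_power(terms, n):
    """n-th power of a sum of scalar multiples of umbrae.

    terms is a list of (scalar, sequence) pairs; sequence[k] is the scalar
    carrying suffix k. Each sequence must have at least n + 1 entries.
    """
    # Powers of the empty sum: 1, 0, 0, ...
    powers = [Fraction(1)] + [Fraction(0)] * n
    for scalar, seq in terms:
        # Binomial expansion; the exponent k of the new umbra becomes the suffix k.
        powers = [
            sum(comb(m, k) * powers[m - k] * Fraction(scalar) ** k * seq[k]
                for k in range(m + 1))
            for m in range(n + 1)
        ]
    return powers[n]

def egf_product_coefficient(a, b, n):
    """Coefficient of theta^n / n! in the product of the exponential series of a and b."""
    total = sum(Fraction(a[k]) / factorial(k) * Fraction(b[n - k]) / factorial(n - k)
                for k in range(n + 1))
    return total * factorial(n)

if __name__ == "__main__":
    a = [1, 2, 3, 4]
    b = [1, 1, 1, 1]
    sum_ab = [(1, a), (1, b)]
    for n in range(4):
        print(n, umbral_power(sum_ab, n), egf_product_coefficient(a, b, n))
    # Dot product of powers (the power with exponent 1 + 2) versus scalar product.
    print(umbral_power(sum_ab, 3), umbral_power(sum_ab, 1) * umbral_power(sum_ab, 2))
```

Passing the same sequence twice, as in `[(alpha, x), (beta, x)]`, corresponds to the case of non-distinct umbrae, since the expansion is made as if the umbrae were distinct before the suffixes are read off.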


Powers are obtained directly from (2.21), or more conveniently thence 
by (1.151): 


(2.3) == tA. => (A+ 


For derivation, we have 


M=0 
= > y (OM/M!), 
M=0 
=y (QM 1) [by (1.5) ], 
M=0 
= D [by (1.124) ], 
M=0 


= (ér)N- (ée™)(0"/M!) [by (1. 125)], 
M 


=Q 


== - 
(2. 4) (ér ) N. 


in complete formal analogy with derivatives of ordinary (scalar) exponential 
functions. From (2.3), (2.4), 


(2. 5) ]4 = (A Ex) 809; 
and from (1.101), (1. 120), 


The coefficient of 6V/N! in the MacLaurin expansion of the left of (2.6) is 
in fact 


N 
(ge + (aa 440})%, 


which is the coefficient of 6V/N! on the right of (2.6). 
Many of the more interesting applications to special sequences of numbers 
(like the Bernoulli or Euler numbers), arise in the following simple way. Let 


&,° * 
be a rational function of 6,%,- - -,y in its lowest terms. Replace a,-: - -,y 
by ae%,- - -, ye”, and let the MacLaurin expansion of the result be 
A(6, wer, ev’) 


(6, ye") 
thus defining the numbers zy (N —0,1,---). Let the MacLaurin expansions 
of A, ® be 

thus defining yy, uy. Hence 
nev? 
ny’ = + u)%. 


Hence, if #(6) is a polynomial in 6, or a power series, if convergent, 


F(6@+y) =&F(6+2+4), 
in which, after expansion, exponents of y, x, u are degraded to suffixes. 

In practice, the special notations { }, +, ()-(), '4:°a# are dropped, 
+, ()(), Aa« being written, as the notation is a sufficient guide to the 
correct use of the algorithms. There are many extensions, in particular one 
to multiple suffixes, as in 74,2,...,c, and the corresponding power series, 


+7. 


Finally, everything down to (2.6) goes through unchanged if scalars in 
(1.1) are re-defined to be elements of any commutative ring with a modulus 
(= identity with respect to multiplication). 


CALIFORNIA INSTITUTE OF TECHNOLOGY. 




THE ABELIAN QUASI-GROUP.* 


By Harriet GRIFFIN. 


Introduction. The purpose of this paper is to investigate the abelian 
quasi-group, which is a commutative system of elements closed under a single 
operation, when certain conditions with respect to coset expansions are imposed 
by stated associative laws as explained in the first section. Throughout the 
paper we show how the abelian quasi-group differs from the abelian group. 
In section two we study in particular the minimal quasi-group of units both 
when each element is the unit for an element of the minimal quasi-group and 
also when this is not the case. In sections three and four we study the orders 
of elements in the case in which the minimal quasi-group is the identity ele- 
ment, and develop a method for setting up a quasi-group with an identity 
element and no subquasi-group other than the identity. We show that two 
conformal abelian quasi-groups need not be isomorphic. 

The subquasi-groups of an abelian quasi-group under the conditions 
imposed form a Dedekind structure only in a special case as shown in section 
five and thus the abelian quasi-group differs greatly from the abelian group. 
Finally in section six we determine a necessary and sufficient condition that 
the cosets of an abelian quasi-group under the imposed associative laws shall 
form a quotient quasi-group. 


SECTION I. 
Associative laws of the abelian quasi-group. 


To facilitate the reading of this paper, we begin with a connected account 
of certain fundamental properties of quasi-groups drawn largely from the 
paper of Hausmann and Ore! upon which our work is based. We do not 
always follow their exact wording, but we believe that any essential departure 
from their presentation is clearly indicated. 


1. Fundamental definitions and notions of the finite abelian quasi-group Q. 
A groupoid is a system consisting of a set of distinct elements 
a, b, ... and one binary operation (multiplication) such that to every ordered 


*Received September 1, 1939; Revised November 20, 1939. 
¹ B. A. Hausmann and Oystein Ore, “Theory of quasi-groups,” American Journal 
of Mathematics, vol. 59, no. 4 (October, 1937), 




726 HARRIET GRIFFIN. 


pair of elements a, b, there corresponds a unique third (the product), c = ab, 
of the set. 

If, further, to each ordered pair a, b there corresponds a unique x such 
that ax = b, and a unique y such that ya = b, the groupoid is called a 
quasi-group. 

Since we are here interested in the abelian quasi-group, we impose the 
added condition that ab = ba. Then the quotients x and y above are equal. 

No identity element need exist. However for each element a there is a 
unique e_a such that a e_a = a, called the unit for a. 

If a is an element of Q, we define the powers of a as a^n = a for n = 1 
and a^n = a^(n−1) · a for n > 1. The order of a is then the least positive integer 
n for which a^n = e_a. Such a finite power of a exists, since Q is assumed finite. 

Let a, b, ..., k be any subset of Q. Then there is a least subquasi-group 
of Q which contains the elements a, b, ..., k. We denote this subquasi-group 
by {a, b, ..., k}. In particular the quasi-group {a} generated by a single 
element is called a cyclic quasi-group. It is to be noted that {a} may contain 
elements other than the powers of a. 
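
The definitions of this paragraph translate directly into a few lines of code. The following is a minimal sketch under stated assumptions: the elements are labelled 0, ..., n − 1, the quasi-group is given by its multiplication table, and the function names and the example table (the product of i and j taken as 2(i + j) modulo 5) are hypothetical illustrations that do not occur in the paper. The example is not claimed to satisfy the associative laws imposed below.

```python
def is_abelian_quasigroup(table):
    """True if table is a symmetric Latin square on the elements 0..n-1."""
    n = len(table)
    elements = set(range(n))
    rows_ok = all(len(row) == n and set(row) == elements for row in table)
    if not rows_ok:
        return False
    cols_ok = all({table[i][j] for i in range(n)} == elements for j in range(n))
    symmetric = all(table[i][j] == table[j][i] for i in range(n) for j in range(n))
    return cols_ok and symmetric

def unit_for(table, a):
    """The unique e with a * e = a."""
    return table[a].index(a)

def order_of(table, a):
    """Least positive n with a^n equal to the unit for a, where a^n = a^(n-1) * a."""
    e = unit_for(table, a)
    power = a
    for n in range(1, len(table) + 1):
        if power == e:
            return n
        power = table[power][a]
    raise ValueError("table is not a finite quasi-group")

# Hypothetical example: i * j = 2(i + j) mod 5.
EXAMPLE = [[(2 * (i + j)) % 5 for j in range(5)] for i in range(5)]

if __name__ == "__main__":
    print(is_abelian_quasigroup(EXAMPLE))
    for a in range(5):
        print(a, unit_for(EXAMPLE, a), order_of(EXAMPLE, a))
```

The loop in `order_of` terminates because multiplication by a permutes the elements, and the unit for a is the element which that permutation carries into a.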


2. Fundamental properties of Q. We wish the abelian quasi-group Q 
to have certain properties and hence impose the following conditions. 

The expansion of Q by means of disjoint cosets with respect to any 
subquasi-group A is to exist, Q = A + q_1 A + ··· + q_n A, where the q_i 
are in Q but not in A. The associative law expressing a necessary and sufficient 
condition for this property as proved by Hausmann and Ore is: 

P₁. If a and b are any elements of Q and c_0 and d_0 are determined so that 

(ab)c_0 = ad_0, 
then for any c 
(ab)c = ad, 

where d is an element of {c_0, d_0, c}. 

Any element of a coset is to define the same coset. Again a necessary and 
sufficient condition for this property as proved by Hausmann and Ore is: 

P₂. For any elements a and b of Q 


(ab)c = ad, 
where d belongs to {b,c}. 

It then follows from P₂ that c(bC) = (bC)c = bC = (cb)C for any C 
containing c, and consequently each subquasi-group C of Q contains all the 
units of Q and each coset aC contains its multiplier a. Hence there is a 
subquasi-group of Q contained in all subquasi-groups of Q and containing all 
the units of Q. We call it the minimal subquasi-group E of Q. When Q contains 
an element a such that a² = a, the minimal subquasi-group consists of 
a, which is then the identity element. It is to be noted that the minimal 
quasi-group E is generated by any one of its elements, but it is not necessary that 
each of its elements be a unit. 

Furthermore it is important to notice that P₁ implies P₂. 

The decomposition of Q into cosets is to be transitive, and herein we 
depart from the interpretation of transitivity given by Hausmann and Ore. 
By transitivity we mean that for every A and B such that Q ⊃ A ⊃ B, 
Q = A + q_1 A + ··· + q_n A, and the expansion 
of Q/B can be obtained by substituting the expansion of A/B in the expansion 
of Q/A. A necessary and sufficient condition for this property is: 

P₃. When q is an element of Q ⊃ {a, B} but is not in {a, B}, then 
for b in B, 

q(ab) = (qa)b_1, 
where b_1 is generated by b. 


Proof: (a). P₃ is a necessary condition. 
Let 
A = B + a_1 B + ··· + a_r B, 
and 
Q = A + q_1 A + ··· + q_n A, 
where Q ⊃ A ⊃ B. 
Then for transitivity of coset decomposition 

Q = B + a_1 B + ··· + a_r B + q_1 B + q_1(a_1 B) + ··· + q_n(a_r B). 

If e is the unit for a_j, q_i(a_j e) = q_i a_j and is in q_i(a_j B). But since a coset is 
generated by any one of its elements, 

q_i(a_j B) = (q_i a_j)B. 

However for b in B, {b} ⊆ B. Hence, as for B, q_i(a_j{b}) = (q_i a_j){b}; 
so that for q in Q and not in {a, B}, q(ab) = (qa)b_1, where b generates b_1. 

(b). P₃ is a sufficient condition since as b varies over the elements of 
any B, b_1 varies over B, and thus q(aB) = (qa)B. 

It is to be noted that P₃ does not govern all the products of elements of 
a quasi-group. Transitivity of coset expansion is not different from coset 
expansion except when applied to the product of elements of a proper subquasi-group 
by an element not in that subquasi-group. This fact is exhibited by 
Table X where having determined subquasi-group E as 1, 2, 3, and A as 
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, P₃ governs blocks of products like 13(42), 
but it does not govern 7(4/) nor any of the products of 4 through 12 by any 



of 4 through 12. This last set of elements forms what we shall call a free 
square in the multiplication table because the products are not governed by P₃. 


TABLE X. ABELIAN QUASI-GROUP OBEYING P₁ AND P₃. 

[The entries of this multiplication table are not recoverable from the source.] 


SECTION II. 
The subquasi-group of units. 


In this section we shall omit all proofs which refer to special tables. 
They may be found in the complete paper. 


1. The Cayley square. It is to be noted that due to the symmetry of 
the Cayley square of an abelian quasi-group, if an element is not placed in 
the principal diagonal space of a row, the placing of the element causes a 
second row to be supplied with the element. On the other hand an element 
placed in the principal diagonal space of a row eliminates just that row. 
Hence if the order of the quasi-group, which is the number of elements it 
possesses, is even, an element which appears in the principal diagonal appears 




there an even number of times. If the order of the quasi-group is odd, however, 
each element appears just once in the principal diagonal. 
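
The statement about the principal diagonal can be tried on the simplest symmetric Cayley squares. The following is a minimal sketch, not taken from the paper: it uses the addition tables of the cyclic groups, which are abelian quasi-groups, as hypothetical test cases, and the function names are assumptions made for illustration.

```python
from collections import Counter

def cyclic_group_table(n):
    """Cayley square of the cyclic group of order n, written additively on 0..n-1."""
    return [[(i + j) % n for j in range(n)] for i in range(n)]

def diagonal_counts(table):
    """Number of appearances of each element in the principal diagonal."""
    return Counter(table[i][i] for i in range(len(table)))

if __name__ == "__main__":
    for n in (5, 6, 7, 8):
        print(n, sorted(diagonal_counts(cyclic_group_table(n)).items()))
```

For odd n every element is counted once; for even n every element that occurs in the diagonal is counted an even number of times, in agreement with the pairing of off-diagonal places described above.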


2. The quasi-groups ℰ_n. We consider first the minimal abelian quasi-group 
consisting of more than one element, i.e., where there is no element 
such that a² = a, and such that each element is the unit for one of the elements 
of the minimal quasi-group. We denote this quasi-group by ℰ_n, while 
the general minimal quasi-group is denoted by E or E_n, where the subscript 
gives the order. 

Since any element has but a single unit, and since in ℰ_n each element is 
the unit for one element of ℰ_n, when a is the unit for b, b must be the unit 
for some c, c for d, etc. If we continue in this fashion and include all the 
elements of ℰ_n upon returning to a, we shall say that the units set up a single 
cycle. Otherwise it is clear that the set of units will be separated into two or 
more cycles. It is evident, however, that no cycle may have but two elements 
since if a is the unit for b, ab = b, and then b cannot be the unit for a. Hence 
in the case of ℰ₅ there can be but a single cycle. 
THEOREM 1. There are no ℰ_n of order 2 nor 4. 
THEOREM 2. An ℰ_n exists for every n not 2 nor 4. 


Proof. By first setting up the cycle of units, 2 for 1, 3 for 2, ..., 1 for n, 
and then the principal diagonal, we have developed general methods for building 
abelian quasi-groups of any odd order, of odd order greater than 5, and 
of even order greater than 6. ℰ₆ is a special case. There is no subquasi-group 
in these tables since the unit for an element must be in any subquasi-group 
with it. 


THEOREM 3. All ℰ₃ and all ℰ₅ are abstractly identical. 
Proof. Having chosen the units the tables are uniquely determined. 
THEOREM 4. The ℰ₆ are abstractly identical. 


Proof. ‘The set of units must form a single cycle, and the two seemingly 
distinct quasi-groups that it is possible to build on the cycle of units are 
cyclic permutations of one another. 


THEOREM 5. Although the set of units for ℰ₇ must form a single cycle, 
the ℰ₇ are not abstractly identical. 


THEOREM 6. The ℰ_n where n > 7 are not abstractly identical. 


Proof. For any even order we can set up an abelian quasi-group ℰ_n in 
which the set of units is broken into cycles 1 through n − 3 and n − 2 




through n. For odd integers (not 3 nor 7) of the form 4n + 3 a second 
method shows how to break the set of units of ℰ_{4n+3} into cycles 1 through 
2n + 1, and 2n + 2 through 4n + 3, while for odd integers greater than 5 
of the form 4n + 1 we can set up an ℰ_{4n+1} with the set of units broken into 
the cycles 1 through 2n + 2, and 2n + 3 through 4n + 1. In all cases the 
diagonal elements are so chosen that there is no subquasi-group. 

These methods together with Theorems 2 and 5 show that there are at 
least two distinct ℰ_n for any n ≥ 7 since a quasi-group in which the set of 
units is broken into two cycles cannot be isomorphic with one in which a single 
cycle includes all elements. 


3. The order of an element of the minimal quasi-group. 
THEOREM 7. No element of E_n can be of order 1 nor n. 


Proof. The order of an element cannot be one since a² ≠ a. It cannot 
be n since the element for which a is the unit cannot occur among the powers 
of a. 


THEOREM 8. In a minimal quasi-group E, if an element e is the unit 
for just s ≥ 0 elements of E, the order of e is at most n − s, and it cannot 
be n − s − 1. 


Proof. Let e be the unit for s distinct elements. Then since none of 
these s elements can be powers of e, the order of e is at most n − s, and may 
be n − s if the powers of e exhaust the n − s elements for which e is not the 
unit. If the order of e were n − s − 1, the remaining element when multiplied 
by e would have to give itself and e would be the unit for s + 1 elements. 
On the other hand if the order of e is less than n − s − 1, e need 
not be the unit for any of the remaining elements. 


COROLLARY. In $E_n$ an element cannot be the unit for n − 1 elements.


When n is 6 or 7, we have set up $E_n$ in which there are elements which
are units for one or two elements and which are of all orders not excluded by
Theorem 8. We also have examples of elements which are not units for any
element of $E_n$ such that their orders include all integers from 2 through
except n − 1.


SECTION III. 


The abelian quasi-group with an identity element and no subquasi-group 
other than the identity. 


1. Order of $I_n$. If an abelian quasi-group of order n has no subquasi-group
except the identity element 1, we designate it by $I_n$.




THE ABELIAN QUASI-GROUP. 731 


THEOREM 1. There are no abelian quasi-groups $I_n$ of orders 2, 3, nor 5
except when they are groups, and there are none of even order > 2.


THEOREM 2. An $I_n$ exists for every odd order > 5.


Proof. Set up the multiplication table of a cyclic group by putting 1
through n in the first column and row and by filling in all diagonals from the
upper right to the lower left with the number in the first row of the diagonal.
Make the last column n, 1, 2, 3, ..., n − 1, and carry the diagonals through
as before. Then by the method of formation the powers of 2 are 2, 3, 4, ..., n, 1,
so that 2 is always of order n. Now interchange the elements of the last row
with the principal diagonal elements immediately above them. This operation
makes n the identity, so call 1, n and n, 1 throughout the table. The order
of 2 remains n. The result is quasi-group L.

There is no subquasi-group other than 1 since for any element except
1 and 2, a² = a − 1, and hence any element not 1 generates 2. But 2 is of
the n-th order. Therefore each element except 1 generates the quasi-group L.

Furthermore the quasi-group is not a group since 2(3·4) = 1, while
(2·3)4 = 8 and this part of the table remains for all orders greater than 5.


2. The order of elements of L. The manner of forming L makes it
possible to find the orders of its elements.


THEOREM 3. The order of every element a except 1, 2, and n of the
quasi-group L of order n set up by Theorem 2 is n or n/g + 1, where g is
the greatest common divisor of a − 1 and n, according as a − 1 is or is not
prime to n.


THEOREM 4. The order of the element n of quasi-group L is less than
n except when n is a prime and 2 is a primitive root mod n, in which case
the order of n is n.


3. The order of the elements of $I_n$.


THEOREM 5. There is no element of $I_n$ of order 2 nor n − 1.


In some cases it is possible to set up a quasi-group lacking a subquasi-group
except 1 without interchanging all the elements of the principal diagonal
and the last row as in Theorem 2. In particular except when n is a prime
and 2 is a primitive root mod n, there is a set of less than n interchanges
which include the 1 and n which always gives a quasi-group. We have used
these methods to set up $I_n$ when n is 7, 9, and 15 and these quasi-groups
afford examples of elements of every order except 2 and n − 1.




SECTION IV. 
The abelian quasi-group in general. 
We turn now to some facts about the general abelian quasi-group Qn. 


THEOREM 1. If a quasi-group $Q_n$ has an element of order n − 1, the
element generates the quasi-group and there is no identity element.


THEOREM 2. Two quasi-groups of the same order and having elements
of corresponding orders need not be abstractly identical.


Proof. We demonstrate this fact by an illustration. Take R with the
elements 1, 2, 3, 4, 5, 6 such that the products of these elements in order by
1 are 5, 2, 3, 1, 6, 4; by 2 are 2, 6, 1, 4, 5, 3; by 3 are 3, 1, 4, 5, 2, 6; by 4
are 1, 4, 5, 6, 3, 2; by 5 are 6, 5, 2, 3, 4, 1; and by 6 are 4, 3, 6, 2, 1, 5.
Take S with elements 2, 3, and 4 as above but let the products for 1 be
6, 2, 3, 1, 4, 5; for 5 be 4, 5, 2, 3, 6, 1; and for 6 be 5, 3, 6, 2, 1, 4. Since 3
is in each the only element of order five, if the quasi-groups were isomorphic,
these elements would have to correspond to each other. But then 4 of R must
correspond to 4 of S; 5 to 5; 2 to 2; and 1 to 1. But in R 1·1 = 5, while
in S 1·1 = 6.
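The illustration can be checked mechanically. The following Python sketch is not part of the paper: it transcribes the two tables as printed, and it assumes that the order of an element a means the number of distinct powers a, a·a, (a·a)·a, and so on. The function names are chosen here for illustration only.

```python
from itertools import permutations

# Row k lists the products of 1, 2, ..., 6 by k, as given in the proof.
R = {
    1: [5, 2, 3, 1, 6, 4],
    2: [2, 6, 1, 4, 5, 3],
    3: [3, 1, 4, 5, 2, 6],
    4: [1, 4, 5, 6, 3, 2],
    5: [6, 5, 2, 3, 4, 1],
    6: [4, 3, 6, 2, 1, 5],
}
# S agrees with R in the rows for 2, 3 and 4 and differs in rows 1, 5 and 6.
S = dict(R)
S[1] = [6, 2, 3, 1, 4, 5]
S[5] = [4, 5, 2, 3, 6, 1]
S[6] = [5, 3, 6, 2, 1, 4]


def mul(table, a, b):
    """Product of a and b read from the table (elements are 1-based)."""
    return table[a][b - 1]


def is_abelian_quasigroup(table):
    """Every row is a permutation of the elements and the table is symmetric."""
    elems = sorted(table)
    rows_ok = all(sorted(table[a]) == elems for a in elems)
    symmetric = all(mul(table, a, b) == mul(table, b, a)
                    for a in elems for b in elems)
    return rows_ok and symmetric


def order(table, a):
    """Assumed definition: the number of distinct powers of a."""
    seen, x = [], a
    while x not in seen:
        seen.append(x)
        x = mul(table, x, a)
    return len(seen)


def find_isomorphism(t1, t2):
    """Brute-force search over all bijections; returns None if none works."""
    elems = sorted(t1)
    for perm in permutations(elems):
        phi = dict(zip(elems, perm))
        if all(phi[mul(t1, a, b)] == mul(t2, phi[a], phi[b])
               for a in elems for b in elems):
            return phi
    return None
```

Under the assumed definition, both tables have element orders 3, 4, 4, 4, 5, 6, with 3 the only element of order five in each, and the exhaustive search over the 720 bijections finds no isomorphism, in agreement with the argument of the proof.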

This fact is interesting since we recall that two abelian groups which are 
conformal are isomorphic. 

Under certain conditions which we discuss in Section VI the cosets with
respect to a subquasi-group B of Q always appear in blocks throughout the
multiplication table due to the fact that Q/B forms a quotient quasi-group.
In this case the cosets themselves form an abelian quasi-group with an identity
element consisting of the subquasi-group B, and since the blocks must combine
as do any elements of the blocks, the blocks obey the associative law of the
original elements. We can under these circumstances make certain general
statements about the order of elements of Q.


THEOREM 3. The order of any element of $Q_n$ with blocks of cosets with
respect to a subquasi-group $B_m$ throughout the table may be only:

a. Those orders permitted in the identity element I = B of the quotient
quasi-group Q/B.

b. 1, 2, ..., or m times the order of any block except I of Q/B.

If the cosets themselves form a quasi-group without a subquasi-group
except I, the theorems of Section III apply. But what is the order of a coset
of the quotient quasi-group if there are subquasi-groups other than I? This
question is the same as asking: What is the order of an element of a quasi-
group of order n with an identity element where there may be subquasi-groups
other than 1, and where the blocks of cosets with respect to any subquasi-group
need not appear throughout the Cayley square? When there is an identity
element, beyond the fact that the order of an element cannot be n − 1, and
that if n is odd, there can be no element of order 2 (due to the coset
expansions), we can state no law which the order of an element obeys. On the other
hand if $Q_n$ has a minimal subquasi-group of units E < Q, and the block
formation does not prevail throughout the multiplication table, no elements
of E are of order n − 1, and the order of an element a not in E cannot be
n − 1, since if b were the element not among the n − 1 powers of a, ab = b,
and then a would be a unit.

Herein lies the difference between the multiplication table of the abelian 
group and the abelian quasi-group, for the cosets of an abelian group always 
form a quotient group and hence the elements of the cosets form blocks 
throughout the multiplication table. But more than that, due to the associative 
law a(bc) = (ab)c, the elements take the same positions in the blocks through- 
out the table. It is this orderly characteristic of the Cayley square for the 
abelian group which distinguishes it from that of the abelian quasi-group. 


SECTION V. 
Structure theory. 


1. Structure definition is satisfied. A structure is a partially ordered
system in which every two elements have a union and a cross cut. In the
case of the quasi-group Q the union $[A_1, \ldots, A_n]$ of subquasi-groups of Q is
the smallest subquasi-group which contains each $A_i$, while the cross cut
$(A_1, \ldots, A_n)$ is the largest subquasi-group contained in every $A_i$. It is
evident that the subquasi-groups of an abelian quasi-group form a finite structure.


2. Dedekind structure. In order that a structure be a Dedekind struc- 
ture, it must satisfy the axiom: 

When A, B, C are any three elements of a structure such that 
A<C< [A, B], then 

C = [A, (B,C) ]. 

When the abelian quasi-group has the properties given by $P_1$ and $P_2$, the
subquasi-groups do not in general satisfy this axiom. To show that it is
violated we use the abelian quasi-group W of twenty-four elements built as the
direct product of two of its subquasi-groups. We let A be the subquasi-group
of elements 1, 2, 3, 4, 5, 6; B be 1, 2, 3, 13, 14, 15; C be 1, 2, 3, 4, 5, 6, 7,
8, 9, 10, 11, 12; while the minimal subquasi-group of units is 1, 2, 3. Although
[A, B] = W, [A, (B, C)] ≠ C. Hence the subquasi-groups of W, which
obeys the laws of coset expansion and transitivity of coset decomposition, do
not in general form a Dedekind structure.
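The violation can be read off from the element lists alone. The following worked check is a sketch, not taken from the paper; it assumes, as the lists suggest, that the cross cut (B, C) is exactly the minimal subquasi-group of units, written E = {1, 2, 3} here.

```latex
\begin{align*}
(B, C) &= \{1, 2, 3\} = E, \\
[A, (B, C)] &= [A, E] = A = \{1, \dots, 6\} \qquad (\text{since } E \subset A), \\
A < C &< [A, B] = W, \qquad \text{yet} \qquad [A, (B, C)] = A \neq C .
\end{align*}
```

So the hypothesis A < C < [A, B] of the axiom holds while its conclusion C = [A, (B, C)] fails.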

The simple case where every subquasi-group of Q belongs to the single
principal chain $Q > A_1 > A_2 > \cdots > E$ is an exception since in this case every
element of $[A_i, A_k]$ is an a, and the Dedekind axiom must be satisfied by the
method of the following Theorem 2.

In view of the fact that the subgroups of an abelian group always form 
a Dedekind structure and since coset expansions and transitivity of coset 
decomposition are such outstanding properties of the group, it is interesting 
to find that the subquasi-groups of an abelian quasi-group having these 
properties fail to form necessarily a Dedekind structure. 

If, however, we strengthen $P_2$ to read:

$P_3$. For any three elements a, b, c of Q


$a(bc) = (ab)c_1,$

where $c_1$ is an element of {c}, we have, as proved by Hausmann and Ore, a
Dedekind structure.

THEOREM 1. When $P_3$ holds, the elements of [A, B] take the form ab.

Proof. Any $a = ae_a$. Furthermore:

$(a_1b_1)(a_2b_2) = ((a_1b_1)a_2)b_3 = ((a_1a_2)b_4)b_3 = a_3(b_4b_5) = a_3b_6$

where $b_3$ is generated by $b_2$; $b_4$ by $b_1$; etc.

It follows then as proved by Hausmann and Ore that: 


THEOREM 2. When A > B, the abelian subquasi-groups A, B, C of Q
obeying $P_3$ satisfy the Dedekind relation


(A, [B, C]) = [B, (A, C)].


Therefore, as pointed out by Hausmann and Ore, the analogues of the
Zassenhaus-Schreier refinement theorem, of the Jordan-Hölder theorem on
the invariance of the lengths of principal chains, and of the Schmidt-Remak
theorem on direct decomposition must hold for the subquasi-groups of an
abelian quasi-group obeying $P_3$. We can say, as does Ore with respect to the
group, that in these respects the theory of the abelian quasi-group is more a
property of the subquasi-groups than of the elements themselves.




SECTION VI. 
The quotient quasi-group. 


1. The abelian quotient quasi-group Q/B. When all and only the elements
of a subquasi-group B of Q occupy the upper left-hand corner of the
Cayley square of Q, due to $P_1$ there are always blocks of cosets of B at the
top and left side of the table of Q. If these blocks are maintained throughout
the table, they form an abelian quotient quasi-group with the identity element
B. In order that blocks of cosets with respect to B be maintained throughout
the table of Q any element of the coset $q_1B$ when multiplied by an element of
$q_2B$ must give another coset and that coset must be the one that contains
$q_1q_2$, which is $(q_1q_2)B$. Conversely if $(q_1B)(q_2B) = (q_1q_2)B$, there are
blocks of cosets with respect to B throughout the table of Q. Hence
$(q_1B)(q_2B) = (q_1q_2)B$ is a necessary and sufficient condition for blocks of
cosets with respect to the subquasi-group B throughout the multiplication
table of Q which obeys $P_1$ and $P_2$.
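The block condition can be tested directly on a Cayley table. The Python sketch below is an illustration only: the table is a list of rows with elements 0, ..., n − 1, the names `coset` and `cosets_form_blocks` are hypothetical, and the example table is the cyclic group of order 6 rather than any quasi-group from the paper.

```python
def coset(table, q, B):
    """The coset qB: all products of q by the elements of B."""
    return frozenset(table[q][b] for b in B)


def cosets_form_blocks(table, B):
    """Check (q1 B)(q2 B) = (q1 q2) B for every pair q1, q2."""
    n = len(table)
    for q1 in range(n):
        for q2 in range(n):
            products = frozenset(
                table[x][y]
                for x in coset(table, q1, B)
                for y in coset(table, q2, B)
            )
            if products != coset(table, table[q1][q2], B):
                return False
    return True


# Illustrative table (an assumption, not from the paper): addition mod 6.
n = 6
Z6 = [[(i + j) % n for j in range(n)] for i in range(n)]

blocks_for_subgroup = cosets_form_blocks(Z6, {0, 3})     # {0, 3} is closed
blocks_for_non_subgroup = cosets_form_blocks(Z6, {0, 1})  # {0, 1} is not
```

For the group table the subgroup {0, 3} passes the test, as the paper's remark about abelian groups leads one to expect, while the subset {0, 1}, which is not closed, fails it.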


2. The quotient quasi-group under $P_3$. Under the law $P_3$ we have:

(aC)(bC) = ((aC)b)C = ((ab)C)C = (ab)C.

Furthermore if (aC)(bC) = (ab)C, by applying $P_3$ we have:

(ab)C = ((bC)a)C.

But a(bC) is a coset with respect to C, and if it is multiplied by an element
of C, it must give one of its own set. Therefore

(ab)C = a(bC).

Thus if C = {c}, for any a, b, c, $(ab)c = a(bc_1)$, where c generates $c_1$.

In like manner if we assume for any a, b, c, that $(ab)c = a(bc_1)$ where
$c_1$ is generated by c, it follows that:


(ab)C = ((ab)C)C = ((bC)a)C = (bC)(aC).²

Furthermore

a(bC) = (a(bC))C = (bC)(aC) = (ab)C.

Hence it follows that:


THEOREM 1. If for every a, b, c, $(ab)c = a(bc_1)$ where $c_1$ is generated
by c, then there is a quotient quasi-group with respect to any subquasi-group
C and further c is generated by $c_1$.


² Hausmann and Ore, loc. cit.




3. The quotient quasi-group under $P_2$. If the abelian quasi-group obeys
the law $P_2$, we do not in general have blocks of cosets throughout the table
since there are the free squares which are not governed by $P_2$. However it is
interesting to notice that $P_2$ forces products of an element not in a subquasi-
group B of Q by the elements of a coset of B/E, where E is the minimal
subquasi-group, to give the elements of a coset of B/E.

Consider subquasi-groups B and E of Q such that E < B < Q. Then $P_2$
applies to the product of q not in B by any two elements of B and
$q(bc) = (qb)c_1$. When all the cosets of Q/E have been determined, the product qb
is determined only in the case where b is an e. Take be of the coset bE.
Let qb = d which may not be in qE nor in bE. Then $q(be_b) = d = (qb)e_d$.
For another e, $q(be_3) = (qb)e_4 = de_4$, and $e_4 \neq e_d$. Hence as $e_3$ varies
through the e, so does $e_4$, and $de_4$ must vary through the elements of dE.
Hence the products of q by bE give another coset dE. These cosets must then
combine to form cosets with respect to the next larger subquasi-group.

It is to be noted, however, that another element of qE when multiplied
by bE may give the elements of a coset different from dE. This fact is
exhibited by quasi-group 4’, and shows that blocks of cosets with respect to E
need not exist in this part of the multiplication table.

Let C be a cyclic subquasi-group next larger than E. Then all elements
of C not in E generate C. When we assume $q(bc) = (qb)c_1$ where q is not in
{b, C}, if c is in E, then $c_1$ is in E and one generates the other. But if c is
in C and not in E, then too $c_1$ must generate c, and as c varies through these
c’s, $c_1$ must vary through the same c’s. After repeated applications of this
argument we may conclude:


THEOREM 2. For q not in {b, C} and c in C, $q(bc) = (qb)c_1$ where c
generates $c_1$ implies that $c_1$ generates c, and where $c_1$ generates c, c generates $c_1$.


If we assume $P_2$, how may we strengthen this law in order to have a
quotient quasi-group with respect to any subquasi-group of Q? First consider
the case above where qb = d and which showed that the elements q(bE)
form a coset with respect to E but that blocks of cosets are not necessarily
formed. We may take these as the elements of a row. To complete the block
we consider that qb = bq and $b(qe_q) = (bq)e_q$. Then if we change $e_q$, $e_r$
varies through the coset qE and the products are to give dE so that $b(qe_r)$
must give $(bq)e_s$ where $e_s$ varies through all the e’s. Hence if the column is
to be dE, $b(qe) = (bq)e_1$.

If we consider the blocks which are to be obtained by multiplying $q_1$ by
qE where neither $q_1$ nor q is in B, a like argument permits us to conclude
that if we have a quotient quasi-group with respect to E, then $a(be) = (ab)e_1$
for any a, b, e.




In order to obtain a condition for a quotient quasi-group with respect to
every subquasi-group, we take C > E with no subquasi-group between C and E.
Then as noted, any c in C but not in E generates C. We assume that we have
blocks with respect to E. Then $(qC)(q_1C)$, where q and $q_1$ are not in C,
must give $(qq_1)C$, i.e., $q(q_1C)$ must equal $(qq_1)C$. But $q(q_1e) = (qq_1)e_1$.
Therefore $q(q_1c) = (qq_1)c_1$, where c generates $c_1$, for any c of C but not in E.
If we call these elements a set of a row, $q_1(qC)$ will fill the column and give
again $c_1$ generated by c. By repeating this process we see that:


THEOREM 3. When an abelian quasi-group satisfies $P_1$ and $P_2$, if a
quotient quasi-group exists for every C, then for every a, b, c,
$a(bc) = (ab)c_1,$
where c generates $c_1$.


Together with Theorem 1 of this section we now have: 


THEOREM 4. When an abelian quasi-group satisfies $P_1$ and $P_2$, a
necessary and sufficient condition for a quotient quasi-group with respect to every
subquasi-group C of Q is:

For any a, b, c,

$a(bc) = (ab)c_1,$
where c generates $c_1$.


NEW YORK UNIVERSITY.


REFERENCES. 


Oystein Ore, “On the foundation of abstract algebra,” I, Annals of Mathematics,
vol. 36 (1935), pp. 406-437; II, ibid., vol. 37 (1936), pp. 265-292; “Structures and
group theory,” Duke Mathematical Journal, vol. 3 (1937), pp. 149-174; “On the
application of structure theory to groups,” Bulletin of the American Mathematical Society,
vol. 44 (1938), pp. 801-806.

Hausmann and Ore, “Theory of quasi-groups,” American Journal of Mathematics,
vol. 59 (1937), pp. 983-1004.


THE GAUSSIAN LAW OF ERRORS IN THE THEORY OF ADDITIVE
NUMBER THEORETIC FUNCTIONS.* ¹


By P. ERDÖS and M. KAC.


The present paper concerns itself with the applications of statistical
methods to some number-theoretic problems. Recent investigations of Erdös
and Wintner² have shown the importance of the notion of statistical
independence in number theory; the purpose of this paper is to emphasize this
fact once again.

It may be mentioned here that we get as a particular case of our main
theorem the following result:

If ν(m) denotes the number of prime divisors of m, and $K_n$ the number
of those integers from 1 up to n for which $\nu(m) < \lg\lg n + \omega\sqrt{2\lg\lg n}$
(ω an arbitrary real number), then

$$\lim_{n\to\infty} \frac{K_n}{n} = \pi^{-1/2}\int_{-\infty}^{\omega} \exp(-u^2)\,du.$$

This theorem refines some known results of Hardy, Ramanujan³ and
Erdös.⁴
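The statement lends itself to a numerical experiment. The Python sketch below is not from the paper: it assumes a simple sieve to tabulate ν(m), takes lg to be the natural logarithm, and evaluates the limit through the identity $\pi^{-1/2}\int_{-\infty}^{\omega}\exp(-u^2)\,du = (1 + \operatorname{erf}\omega)/2$. All function names are illustrative.

```python
import math


def distinct_prime_divisor_counts(n):
    """Return a list nu with nu[m] = number of distinct primes dividing m."""
    nu = [0] * (n + 1)
    for p in range(2, n + 1):
        if nu[p] == 0:  # no smaller prime divides p, so p is prime
            for multiple in range(p, n + 1, p):
                nu[multiple] += 1
    return nu


def gaussian_D(omega):
    """pi^(-1/2) times the integral of exp(-u^2) from -infinity to omega."""
    return 0.5 * (1.0 + math.erf(omega))


def empirical_fraction(n, omega):
    """K_n / n for the inequality nu(m) < lg lg n + omega * sqrt(2 lg lg n).

    Requires n >= 3 so that lg lg n is positive.
    """
    nu = distinct_prime_divisor_counts(n)
    lglg = math.log(math.log(n))
    threshold = lglg + omega * math.sqrt(2.0 * lglg)
    k_n = sum(1 for m in range(1, n + 1) if nu[m] < threshold)
    return k_n / n
```

A call such as `empirical_fraction(10**6, 0.5)` can be compared with `gaussian_D(0.5)`. Because lg lg n grows extremely slowly and ν(m) takes only integer values, the agreement at any computable n should be expected to be rough.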
1. In what follows p will denote a prime and ω will denote a real number.


Let f(m) be an additive number-theoretic function, so that f(mn)
= f(m) + f(n) if (m, n) = 1. Suppose that $f(p^k) = f(p)$ and $|f(p)| \le 1$.
Obviously

$$f(m) = \sum_{p \mid m} f(p).$$

Furthermore put $\sum_{p<n} p^{-1}f(p) = A_n$ and $\big(\sum_{p<n} p^{-1}f^2(p)\big)^{1/2} = B_n$. Then our main
theorem may be stated as follows:


* Received December 7, 1939.

¹ A preliminary account appeared in the Proceedings of the National Academy,
vol. 25 (1939), pp. 206-207.

² P. Erdös and A. Wintner, “Additive arithmetic functions and statistical
independence,” American Journal of Mathematics, vol. 61 (1939), pp. 713-722.

³ Srinivasa Ramanujan, Collected Papers (1927), pp. 262-275.

⁴ P. Erdös, “Note on the number of prime divisors of integers,” Journal of the
London Mathematical Society, vol. 12 (1937), pp. 308-314.



THE GAUSSIAN LAW OF ERRORS. 739 


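THEOREM. If $B_n \to \infty$ as $n \to \infty$, and $K_n$ denotes the number of integers
m from 1 up to n for which

$$f(m) < A_n + \omega\sqrt{2}\,B_n,$$

then

$$\lim_{n\to\infty} \frac{K_n}{n} = \pi^{-1/2}\int_{-\infty}^{\omega} \exp(-u^2)\,du = D(\omega).$$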


2. We first prove the following


LEMMA 1. Let

$$f_l(m) = \sum_{p \mid m,\; p<l} f(p).$$

Then denoting by $\delta_l$ the density of the set of integers m for which
$f_l(m) < A_l + \omega\sqrt{2}\,B_l$ one has

$$\lim_{l\to\infty} \delta_l = D(\omega).$$

Let $\rho_p(n)$ be 0 or f(p) according as p does not or does divide n. Then

$$f_l(m) = \sum_{p<l} \rho_p(m).$$

Since the $\rho_p(n)$ are statistically independent, $f_l(m)$ behaves like a sum of
independent random variables and consequently the distribution function of
$(f_l(m) - A_l)/(\sqrt{2}\,B_l)$ is a convolution (Faltung) of the distribution functions
of $(\rho_p(m) - p^{-1}f(p))/(\sqrt{2}\,B_l)$⁵ (p < l). It is easy to see that the “central limit
theorem of the calculus of probability” can be applied to the present case,⁶
and this proves our lemma.


3. Lemma 1 is the only “statistical” lemma in the proof. Using this
lemma, the main result will be established by purely number-theoretical
methods.


⁵ Loc. cit. 2, where statistical independence of arithmetical functions is defined
and discussed. See also P. Hartman, E. R. van Kampen and A. Wintner, American
Journal of Mathematics, vol. 61 (1939), pp. 477-486.

⁶ Cf. for instance the first chapter of S. Bernstein’s paper, “Sur l’extension du
théorème limite du calcul des probabilités aux sommes de quantités dépendantes,”
Mathematische Annalen, vol. 97, pp. 1-59. See also M. Kac and H. Steinhaus, “Sur les
fonctions indépendantes II,” Studia Math., vol. 6 (1936), pp. 59-66.


740 P. ERDÖS AND M. KAC.


LEMMA 2. If $m_n$ tends to ∞ (as n → ∞) more rapidly than any fixed
power of $s_n$, then the number of integers from 1 up to $m_n$ which are not
divisible by any prime less than $s_n$ is equal to

$$e^{-C}\,\frac{m_n}{\lg s_n} + o\!\left(\frac{m_n}{\lg s_n}\right),$$

where C denotes Euler’s constant.
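For small parameters the count in Lemma 2 can be computed exactly and set beside the main term. The Python sketch below is an illustration under stated assumptions: lg is taken as the natural logarithm, the numerical value of Euler's constant is hard-coded, and the function names are hypothetical. It makes no claim about how fast the two quantities approach one another.

```python
import math

EULER_C = 0.5772156649015329  # Euler's constant C


def primes_below(s):
    """All primes p with p < s, by trial division."""
    return [p for p in range(2, s)
            if all(p % d for d in range(2, math.isqrt(p) + 1))]


def count_without_small_prime_factor(m, s):
    """Number of integers in 1..m not divisible by any prime less than s."""
    primes = primes_below(s)
    return sum(1 for k in range(1, m + 1) if all(k % p for p in primes))


def main_term(m, s):
    """The main term e^(-C) * m / lg s of Lemma 2 (requires s > 1)."""
    return math.exp(-EULER_C) * m / math.log(s)
```

For instance, `count_without_small_prime_factor(30, 7)` counts the integers 1, 7, 11, 13, 17, 19, 23, 29. The lemma concerns the regime in which m exceeds every fixed power of s, so a comparison with `main_term` is meaningful only when m is very much larger than s.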


The proof of this statement is implicitly contained in the reasoning of
V. Brun on page 21 of his famous memoir “Le crible d’Eratosthène et le
théorème de Goldbach”⁷ and may therefore be omitted.

Let φ(n) represent a function which tends, as n → ∞, to 0 in such a
way that $n^{\varphi(n)} \to \infty$. The function $n^{\varphi(n)}$ will be denoted by $\alpha_n$ and $n^{\sqrt{\varphi(n)}}$
by $\beta_n$. Let $a_1(n), a_2(n), \ldots$ be the integers whose prime factors are all less
than $\alpha_n$, and let ψ(m; n) be the greatest $a_i(n)$ which divides m. We then have
the following


LEMMA 3. The number of integers m ≤ n for which ψ(m; n) = $a_i(n)$,
where $a_i(n) \le \beta_n$, is equal to

$$e^{-C}\,\frac{n}{a_i(n)\,\varphi(n)\lg n} + o\!\left(\frac{n}{a_i(n)\,\varphi(n)\lg n}\right).$$

This is a direct consequence of Lemma 2. For consider all those integers ≤ n
which are of the form $r\cdot a_i(n)$ and such that r is not divisible by any prime
< $\alpha_n$. Evidently, the integers thus defined are all the integers ≤ n for which
ψ(m; n) = $a_i(n)$. Their number is equal to the number of integers r which
are ≤ n/$a_i(n)$ and not divisible by any prime < $\alpha_n$. The restriction
$a_i(n) \le \beta_n$ makes n/$a_i(n)$ tend to ∞ more rapidly than any power of $\alpha_n$
and therefore Lemma 2 can be applied (put $m_n = n/a_i(n)$ and $s_n = \alpha_n$).
This completes the proof.


LEMMA 4. The number γ of integers ≤ M divisible by an $a_i(n) > \beta_n$
is less than $bM\sqrt{\varphi(n)}$, where b is an absolute constant. (It follows from this
that the density of the integers which are divisible by an $a_i(n) > \beta_n$ is less
than $b\sqrt{\varphi(n)}$.)


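We have

$$\prod_{m=1}^{M} \psi(m; n) = \prod_{p<\alpha_n} p^{\sum_{k}[M/p^k]} < \prod_{p<\alpha_n} p^{2M/p}$$

and since

$$\lg \prod_{p<\alpha_n} p^{2M/p} = 2M \sum_{p<\alpha_n} \frac{\lg p}{p} \sim 2M\varphi(n)\lg n$$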


7 Skrifter Videns, Kristiania, 1920. 




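one has

$$\prod_{m=1}^{M} \psi(m; n) < \exp\big(bM\varphi(n)\lg n\big).$$

Hence, finally, $\beta_n^{\gamma} < \exp(bM\varphi(n)\lg n)$, that is, $\gamma < bM\sqrt{\varphi(n)}$.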

4. LEMMA 5. Denote by $l_n$ the number of integers from 1 up to n
for which

(i) $f_{\alpha_n}(m) < A_{\alpha_n} + \omega\sqrt{2}\,B_{\alpha_n}$.

Then

$$\lim_{n\to\infty} \frac{l_n}{n} = D(\omega).$$

Divide the integers from 1 up to n which satisfy (i) into classes $E_1, E_2, \ldots$
so that m belongs to $E_i$ if and only if ψ(m; n) = $a_i(n)$; and denote by $|E_i|$
the number of integers in $E_i$. One obviously has

$$l_n = \sum_{a_i \le \beta_n} |E_i| + \sum_{a_i > \beta_n} |E_i|.$$

By Lemma 4, $\sum_{a_i > \beta_n} |E_i| < bn\sqrt{\varphi(n)}$ and therefore it is sufficient to prove
that $n^{-1}\sum_{a_i \le \beta_n} |E_i| \to D(\omega)$ as n → ∞. On the other hand by Lemma 3


where the dash in the summation indicates that it is extended over the $a_i$’s
satisfying $f_{\alpha_n}(a_i) < A_{\alpha_n} + \omega\sqrt{2}\,B_{\alpha_n}$. In order to evaluate Σ′, divide all the
integers into classes $F_1, F_2, \ldots$ having the property that m belongs to $F_i$
if and only if ψ(m; n) = $a_i(n)$ and let $\{F_i\}$ denote the density of $F_i$.
Consider now the set Σ′$F_i$, where the dash in summation has the same meaning
as above. By putting l = $\alpha_n$ and using Lemma 1 we have that $\{\Sigma' F_i\} \to D(\omega)$
as n → ∞ or $\{\Sigma' F_i\} = D(\omega) + o(1)$. Now

(iii) $\{\Sigma' F_i\} = \{\sum'_{a_i \le \beta_n} F_i\} + \{\sum'_{a_i > \beta_n} F_i\}$

and by Lemma 4

(iv) $\{\sum'_{a_i > \beta_n} F_i\} < b\sqrt{\varphi(n)}$.

Furthermore there is only a finite number of $a_i$’s which are less than $\beta_n$ and
therefore $\{\sum'_{a_i < \beta_n} F_i\} = \sum'_{a_i < \beta_n} \{F_i\}$. But


ai<Pn ai< Bn 


ai(n) g(n) lan \O(n) Ign 


and this implies that 


Bn o(n) Ign b(n) len} “i(™) 


Finally (iii), (iv) and (v) give 


ee 1 , 1 | 


The combination of this formula with (ii) completes the proof of our Lemma. 


5. We now come to the proof of the main theorem. Notice first
that for m ≤ n, $|f(m) - f_{\alpha_n}(m)| < 1/\varphi(n)$. In fact, |f(p)| ≤ 1 implies
that $|f(m) - f_{\alpha_n}(m)|$ is less than the number of those prime divisors of m
which are ≥ $\alpha_n$. This number is obviously < 1/φ(n), since $(\alpha_n)^{1/\varphi(n)} = n$.
Notice furthermore that |f(p)| ≤ 1 and the well known results concerning
the sum $\sum_{p<n} p^{-1}$ imply that $|A_n - A_{\alpha_n}| < -C_1\lg\varphi(n)$ and
$|B_n - B_{\alpha_n}| < -C_2\lg\varphi(n)$, where $C_1$ and $C_2$ are absolute constants.
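The bound on the number of large prime divisors, which the text calls obvious, can be spelled out as follows. This is a sketch only; the letter k is introduced here as a hypothetical name for the number of distinct primes $p \ge \alpha_n$ dividing a given m ≤ n.

```latex
\[
\alpha_n^{\,k} \;\le\; \prod_{\substack{p \mid m \\ p \ge \alpha_n}} p \;\le\; m \;\le\; n \;=\; \alpha_n^{1/\varphi(n)}
\quad\Longrightarrow\quad k \;\le\; \frac{1}{\varphi(n)} .
\]
```

For k ≥ 2 the first inequality is strict, since distinct primes cannot all equal $\alpha_n$, and for k = 1 the conclusion is strict as soon as φ(n) < 1, which holds for large n because φ(n) → 0.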


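Now choose φ(n) so that 1/φ(n) = o($B_n$). Evidently every m ≤ n
satisfying the inequality $f(m) < A_n + \omega\sqrt{2}\,B_n$ also satisfies, for sufficiently
large n, the inequality $f_{\alpha_n}(m) < A_{\alpha_n} + (\omega+\epsilon)\sqrt{2}\,B_{\alpha_n}$. In addition every
m ≤ n satisfying $f_{\alpha_n}(m) < A_{\alpha_n} + (\omega-\epsilon)\sqrt{2}\,B_{\alpha_n}$ satisfies, for sufficiently
large n, the inequality $f(m) < A_n + \omega\sqrt{2}\,B_n$. Hence, by Lemma 5,

$$D(\omega-\epsilon) \le \liminf_{n\to\infty} \frac{K_n}{n} \le \limsup_{n\to\infty} \frac{K_n}{n} \le D(\omega+\epsilon);$$

and this proves the theorem, since ε > 0 is arbitrary.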


6. The theorem mentioned in the introduction is obviously a particular
case of our main theorem. It corresponds to the case f(p) = 1. Because of
the large number of applications of ν(m) it is of special interest. It should
be mentioned that the assumption $f(p^k) = f(p)$ can be removed; also
|f(p)| ≤ 1 may be replaced by a much weaker condition. This, however, would
complicate the statement of the main theorem.

We may perhaps point out that Lemma 2 (Brun) is the “deepest” part
of the proof and that the “statistical” part is relatively superficial. However,
the statistical considerations seemed to be suggestive and fruitful in leading to
new and perhaps striking results.


INSTITUTE FOR ADVANCED STUDY AND CORNELL UNIVERSITY. 




ON THE STANDARD DEVIATIONS OF ADDITIVE 
ARITHMETICAL FUNCTIONS.* 


By PHILIP HARTMAN and AUREL WINTNER.


1. If $p = p_k$ denotes the k-th prime number and l a positive integer,
then, on starting with any double sequence of real numbers $a_{kl}$, put, for every
positive integer n,

(1) $f(n) = \sum_{k=1}^{\infty} f_k(n) = \lim_{k\to\infty} f^k(n)$,

where

(2) $f^k(n) = \sum_{j=1}^{k} f_j(n)$,

and

(3) $f_k(n) = 0$ if $n \not\equiv 0 \pmod{p_k}$; $f_k(n) = f(p_k^l)$ if $n \equiv 0 \pmod{p_k^l}$ and $n \not\equiv 0 \pmod{p_k^{l+1}}$;

finally $f(p_k^l) = a_{kl}$. It is clear that the functions f(n), thus obtained, and
only these functions, are additive, i. e., such that

(4) $f(n_1n_2) = f(n_1) + f(n_2)$ whenever $(n_1, n_2) = 1$; (f(1) = 0).

In fact, the series (1) is convergent for every choice of the double sequence
$\{a_{kl}\}$, since the series has, for every fixed n, at most a finite number of
non-vanishing terms. It is also clear that an additive function f and either
of the two sequences of additive functions $\{f_k\}$, $\{f^k\}$ of n determine each other
uniquely. The additive functions to be considered will be assumed to be
real-valued.
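The construction can be mirrored in a few lines of code. The Python sketch below is illustrative and not from the paper: the prescribed value on a prime power $p^l$ is passed as a function of the pair (p, l), a stand-in for the double sequence of the text, and the helper names are hypothetical.

```python
import math


def factorize(n):
    """Return {prime: exponent} for a positive integer n, by trial division."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def additive(value_on_prime_power):
    """Build f with f(n) = sum of value_on_prime_power(p, l) over p^l || n."""
    def f(n):
        return sum(value_on_prime_power(p, l)
                   for p, l in factorize(n).items())
    return f


# Two hypothetical choices of the prescribed values:
nu = additive(lambda p, l: 1)                     # number of distinct primes
log_like = additive(lambda p, l: l * math.log(p))  # recovers log n
```

Only the finitely many primes dividing n contribute, which is the reason the series (1) always converges, and the additivity relation (4) holds for coprime arguments because their factorizations involve disjoint sets of primes.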
For a given y = f(n), define $y^+ = f^+(n)$ by placing

(5) $y^+ = y$ or $y^+ = 1$ according as |y| < 1 or |y| ≥ 1.

Then the question as to the existence of an asymptotic distribution function
of an f may be answered as follows:¹

(I) An additive f(n) has an asymptotic distribution function σ(x),
−∞ < x < +∞, if and only if both series

(6₁) $\sum_p \frac{f^+(p)}{p}$,  (6₂) $\sum_p \frac{f^+(p)^2}{p}$

are convergent.

* Received April 8, 1940.

¹ P. Erdös and A. Wintner, “Additive arithmetical functions and statistical
independence,” American Journal of Mathematics, vol. 61 (1939), pp. 713-721.




744 PHILIP HARTMAN AND AUREL WINTNER. 


It is clear that if $f_k$ is an additive function depending only on the k-th
prime number (i. e., if it is of the type (3)), then it always has the asymptotic
distribution function

(7) $\sigma_k(x) = \frac{p-1}{p}\sum_{f(p^m)\le x} \frac{1}{p^m}$, ($p = p_k$; m = 0, 1, ...).

Furthermore, it is easy to see that the terms $f_j(n)$ of an additive function (2),
which depends on a finite number of prime numbers, are statistically
independent; so that, in particular, the additive function $f^k$ always has an
asymptotic distribution function $\sigma^k(x)$, −∞ < x < +∞, and the latter is represented
by the convolution

(8) $\sigma^k = \sigma_1 * \sigma_2 * \cdots * \sigma_k$.

It was shown loc. cit.¹ that the infinite sum (1) cannot have an asymptotic
distribution function σ(x) unless it has the asymptotic distribution which one
would formally expect in virtue of (1) and (8), i.e., that (I) may be replaced
by the following theorem:

(I′) An additive f(n) has an asymptotic distribution function σ(x),
−∞ < x < +∞, if and only if the infinite convolution $\sigma_1 * \sigma_2 * \cdots$ of the
asymptotic distribution functions (7) of its terms (3) is convergent, in which
case one necessarily has

(9) $\sigma = \sigma_1 * \sigma_2 * \cdots$.

2. For the more restricted class of almost periodic functions (B²), the
following theorem was recently established:²

(II) An additive f(n) is almost periodic (B²) if and only if both series
(10,) HP), (102) 


are convergent. 

Since this theorem is analogous to (I), there arises the question whether
or not it is possible to replace (II) by a criterion (II′) which relates to it as
(I′) does to (I). We shall prove that such is actually the case:

(II′) An additive f(n) is almost periodic (B²) if and only if its asymptotic
distribution function σ(x) possesses a second moment

(11) $\int_{-\infty}^{+\infty} x^2\,d\sigma(x) < \infty$;


² P. Erdös and A. Wintner, “Additive functions and almost periodicity (B²),”
American Journal of Mathematics, vol. 62 (1940), pp. 635-645.


STANDARD DEVIATIONS OF ADDITIVE ARITHMETICAL FUNCTIONS. 745 


in which case one necessarily has 
+00 


(12) 


-0O 


It is understood that M{g} denotes the mean

(13) $M\{g\} = \lim_{n\to\infty} \frac{1}{n}\,\big(g(1) + g(2) + \cdots + g(n)\big)$

if this limit exists. Incidentally, it will be clear from the proof of (II′) that
the condition (11) of almost periodicity (B²) is satisfied if and only if f has
an asymptotic distribution function and is such that

(14) $\bar{M}\{f^2\} < \infty$,

where $\bar{M}\{g\}$ denotes the upper mean

(15) $\bar{M}\{g\} = \limsup_{n\to\infty} \frac{1}{n}\,\big(g(1) + g(2) + \cdots + g(n)\big)$ if g ≥ 0.


2 bis. It is very striking that the moment criterion (11) of (II′) can
insure almost periodicity (B²). In fact, there will be given at the end of
§ 4 bis an example of a series of independent almost periodic (B²) functions,
with the property that the series is convergent everywhere to a limit
function for which the square mean is infinite, although this function possesses
an asymptotic distribution which is represented by the corresponding infinite
convolution and which has a finite second moment. This example shows that
the possibility of replacing (II) by (II′) depends essentially on the properties
of prime numbers, and not merely on the statistical independence of the
terms of (1).

Similar remarks hold for the equivalence of (I) and (I′).


3. For a given distribution function ρ(x), −∞ < x < +∞, and for a
given positive integer i, let $\mu_i(\rho)$ denote the i-th moment

(16) $\mu_i(\rho) = \int_{-\infty}^{+\infty} x^i\,d\rho(x)$ (if $\int_{-\infty}^{+\infty} |x|^i\,d\rho(x) < \infty$).

Thus, ρ has a finite standard deviation if and only if $\mu_2(\rho) < \infty$; in which
case the square of the standard deviation of ρ is

(17) $\nu(\rho) = \mu_2(\rho) - \mu_1(\rho)^2 \ge 0$.

Before proving (II′), it will be convenient to establish the following
theorem:




(II*) An additive f(n) is almost periodic (B²) if and only if the asymptotic
distribution functions (7) of its terms (3) are such as to make both numerical
series

(18₁) $\sum_{k=1}^{\infty} \mu_1(\sigma_k)$,  (18₂) $\sum_{k=1}^{\infty} \nu(\sigma_k)$

convergent; in which case (18₁), (18₂) necessarily represent $\mu_1(\sigma)$, $\nu(\sigma)$,
respectively, where σ is the asymptotic distribution function (9) of f(n).


Notice that the convergence of (18₂) implies that $\nu(\sigma_k) < \infty$, i.e.
$\mu_2(\sigma_k) < \infty$ for every k.

For a given distribution function p(z), +o, let p’, p” 
denote the non-negative numbers 


(19) p=p(—1), p’=1— 
and put 
0, 
p(x) —p’, if —l<2¢S0, 
i, 
(so that p(x), 0 <4 < + %, obviously is a distribution function). Then 


one can express the theorem which relates to (I′) in the same way as (II*)
does to the equivalent formulation (II′) of (II), as follows:


(I*) An additive f(n) has an asymptotic distribution function if and only
if the asymptotic distribution functions (7) of its terms (3) are such as to
make the three numerical series


(21) (on + on"), 
k=1 k=1 k=1 
convergent, 
In fact, it is known * that an infinite convolution o, * + is con- 


vergent if and only if the three numerical series (21) are convergent. Hence, 
(1*) follows from (I’). 


4, In order to avoid an interruption of the proofs, there will first be 
proved a relation between the existence of time averages and space averages 
of an arbitrary function g(t), 0S ¢ < o, which is almost periodic (B*) for 
some fixed X=1. The case of number-theoretical functions f(n) may be 


3 B. Jessen and A. Wintner, "Distribution functions and the Riemann zeta function,"
Transactions of the American Mathematical Society, vol. 38 (1935), pp. 48-85;
Theorem 34.


STANDARD DEVIATIONS OF ADDITIVE ARITHMETICAL FUNCTIONS. TAT 


thought of as the particular case in which $g(t) = f(n)$ for $n - 1 < t \le n$,
where $n = 1, 2, \cdots$ (and f(0) say). As will be seen in § 4 bis, the
theorem to be obtained is not obvious in itself. In fact, it is to the effect that, 
in a case of almost periodicity, the general inequality of Fatou becomes an 
equality. 

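THEOREM. If $\sigma(x)$, $-\infty < x < +\infty$, denotes the asymptotic distribution
function of a real-valued function $g(t)$, $0 \le t < \infty$, which is almost
periodic $(B^\lambda)$ for a fixed $\lambda \ge 1$, then

(22) $M\{|g|^p\} = \int_{-\infty}^{+\infty} |x|^p \, d\sigma(x)$

for every positive exponent $p \le \lambda$ (which implies that

(23) $M\{g^p\} = \int_{-\infty}^{+\infty} x^p \, d\sigma(x)$

for every positive $p \le \lambda$, if $p$ is an integer).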
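
The equality of time average and space average asserted by the Theorem can be checked by hand in the simplest almost periodic case, that of a purely periodic sequence. The sketch below is only an illustration under that simplifying assumption; the function names and the sample sequence `g` are hypothetical and not taken from the paper.

```python
from collections import Counter
from fractions import Fraction


def time_average(values, p, periods):
    """Mean of |g|^p over `periods` full periods of a periodic sequence."""
    n = len(values) * periods
    total = sum(Fraction(abs(values[m % len(values)]) ** p) for m in range(n))
    return total / n


def space_average(values, p):
    """Integral of |x|^p against the distribution of one period (a finite sum)."""
    freq = Counter(values)
    return sum(Fraction(c, len(values)) * abs(x) ** p for x, c in freq.items())


# Hypothetical periodic sequence with period 6 (illustration only).
g = [3, -1, 0, -1, 2, 0]
lhs = time_average(g, 2, periods=50)
rhs = space_average(g, 2)
print(lhs, rhs)
```

For a periodic sequence the asymptotic distribution function is just the distribution over one period, so both sides reduce to the same finite sum; the content of the Theorem is that the equality persists for general almost periodic $g$.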

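Proof. For a fixed $T > 0$, let $\sigma_T(x)$, $-\infty < x < +\infty$, denote the
distribution function of the function (of class $L^\lambda$) which is equal to $g(t)$ for
$0 < t < T$ and has the period $T$. Then, by the definition of the asymptotic
distribution function of $g(t)$,

(24) $\sigma_T(x) \to \sigma(x)$ as $T \to \infty$

holds at every continuity point $x$ of $\sigma$; while obviously

$\int_{-\infty}^{+\infty} |x|^\lambda \, d\sigma_T(x) = \frac{1}{T} \int_0^T |g(t)|^\lambda \, dt.$

Since, by Fatou's inequality,

$\int_{-\infty}^{+\infty} |x|^\lambda \, d \lim_{T \to \infty} \sigma_T(x) \le \liminf_{T \to \infty} \int_{-\infty}^{+\infty} |x|^\lambda \, d\sigma_T(x),$

it follows that

(25) $\int_{-\infty}^{+\infty} |x|^\lambda \, d\sigma(x) \le M\{|g|^\lambda\} < \infty.$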

On the other hand, it is known⁴ that if $g_X(t)$, $0 \le t < \infty$, denotes, for
a fixed positive number $X$, the function which is equal to $-X$, $g(t)$ or $X$

4 A. S. Besicovitch, Almost Periodic Functions, Cambridge, 1932, p. 100.

according as $g(t) < -X$, $|g(t)| \le X$, or $g(t) > X$, then $g_X$ is almost
periodic $(B^\lambda)$ and one can find an $X_1 > 0$ such that


(26) if 
In view of (25), one can choose $X_1$ so large that
-X +00 
(27) f | a |\do(x) + f |aPdo(z) 
-00 


Since $\sigma$ is monotone, one can assume that $x = \pm X_1$ are continuity points of
$\sigma(x)$. Then it is clear from (24) that if $\epsilon > 0$ is given and $X$ denotes $X_1$,
one can choose a positive $T = T(X_1, \epsilon) = T_\epsilon$ so large that


(28) | or(x) for r= +X, (X = X,). 
In addition, $T = T_\epsilon$ may be chosen so large that
4 4 
(29) | f | |\dor (x) —f |\do(x)| (X == X,) ; 
<< 


(this is clear from (24) and from Helly's term-by-term integration theorem,
since $X = X_1$ is fixed). Furthermore, (26) assures that, since $X = X_1$ is
fixed, one can choose $T = T_\epsilon$ so large that

T 


(30) as | g(t) —gx(t) |Adt (X X,). 


0 


Finally, since $M\{|g|^\lambda\}$ exists ($< \infty$), one has


0 


if $T = T_\epsilon$ is chosen sufficiently large.
Since the definitions of $\sigma_T(x)$ and $g_X(t)$ obviously imply that


T 
J | X) +1 —or(X)], 
-X 


0 


it is clear from (27), (29) and (28) that 


oo 


T 
x |\do(2) f | gx(t) [dt | < 4e + XMo(—X) +1—o(X)]; 


so that, since $X^\lambda[\sigma(-X) + 1 - \sigma(X)]$ is certainly not larger than the sum
of the two integrals on the left of (27), one has


+00 
where 


r T 
in J | gx(t) 


Since it is seen from (30) and from Hölder's inequality that this quantity is
majorized by a function of $\epsilon$ which tends to 0 as $\epsilon \to 0$, it follows from (31)
and (32) that (22) is true for $p = \lambda$.

It is clear from this proof of (22) for $p = \lambda$, that (22) holds a fortiori
for $0 < p < \lambda$, and that (23) is valid for every positive integer $p$ not greater
than $\lambda$; so that the proof of the Theorem is complete.


4bis. That in the Theorem, just proved, the assumption of almost 
periodicity cannot be replaced by a mere average assumption, is shown by the 
following example: 

In terms of a sequence of non-negative numbers 4, d@2,° °°, define a 
function g(t), 0S < by placing g(t) = 0 for every ¢ not contained in 
any of the intervals nt << n-+-n7, where n =1,2,---, and g(t) =a, if 
t is in the n-th of these intervals. Clearly, the asymptotic distribution func- 
tion, o(x), of g(t) exists for every choice of the sequence {d,}; in fact, 
a(x) =4(1+sgnz), so that the Stieltjes integral on the right of (16) 
exists and vanishes for every value of the exponent ». On the other hand, 
one can choose the sequence {a,} so that (i) there exists a finite non-vanishing 
mean value M{g?}; (ii) the mean value M{g?} =-+ ~; (ili) there does not 
exist a mean value M{g?} =-+ o. In order to see this, it is sufficient to 
choose 


n, if n 2"; 
(i) dn =n; (ii) =n; (iii) = 0, 2", 


In all three cases, (22) fails to hold for » = 2, although the integral on the 
right exists for p = 2. 

In order to obtain a series of the type mentioned in § 2 bis, it is sufficient 
to consider the function g(t), Ot < , defined by the convergent series 
gi(t) + go(t) +, where gn(t) has, for a fixed n, the value a, or 0 
according as ¢ is or is not in the intervaln St <n+n". 




5. Besides the theorem proved in § 4, a parallel theorem on infinite 
convolutions will be needed. 


Lemma. If an infinite convolution $\sigma_1 * \sigma_2 * \cdots$ converges to a distribution
function $\sigma$ for which $\mu_2(\sigma) < \infty$, then the series (18_1), (18_2) are
convergent and represent $\mu_1(\sigma)$, $v(\sigma)$, respectively.


The proof of this Lemma will depend on the following known criterion: 

If a sequence of distribution functions $\sigma_1, \sigma_2, \cdots$ is such as to make the
series (18_2) convergent (so that, in particular, $\mu_2(\sigma_k) < \infty$ for $k = 1, 2, \cdots$),
then the infinite convolution $\sigma_1 * \sigma_2 * \cdots$ is convergent if and only if the series
(18_1) is convergent; in which case the distribution function $\sigma = \sigma_1 * \sigma_2 * \cdots$
has a second moment $\mu_2(\sigma) < \infty$, and the series (18_1), (18_2) represent $\mu_1(\sigma)$,
$v(\sigma)$, respectively.

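It is easily verified from the definition of the convolution $\alpha * \beta$ of two
distribution functions $\alpha = \alpha(x)$, $\beta = \beta(x)$, that

(33) $\mu_2(\alpha * \beta) < \infty$ if and only if $\mu_2(\alpha) + \mu_2(\beta) < \infty$.
Furthermore,
(34) if $\mu_2(\alpha * \beta) < \infty$, then $v(\alpha * \beta) = v(\alpha) + v(\beta)$.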
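
The relation (34) appears to record the familiar additivity of the squared standard deviation under convolution. The sketch below checks that additivity on two small discrete distributions; the function names and the distributions `alpha`, `beta` are hypothetical choices for this illustration, and the discrete setting is an assumption that sidesteps all convergence questions.

```python
from fractions import Fraction
from itertools import product


def moment(dist, i):
    """i-th moment of a discrete distribution given as {value: probability}."""
    return sum(p * x**i for x, p in dist.items())


def variance(dist):
    """Squared standard deviation: second moment minus squared first moment."""
    return moment(dist, 2) - moment(dist, 1) ** 2


def convolve(alpha, beta):
    """Distribution of the sum of independent variables with laws alpha, beta."""
    out = {}
    for (x, p), (y, q) in product(alpha.items(), beta.items()):
        out[x + y] = out.get(x + y, 0) + p * q
    return out


# Hypothetical two-point and three-point distributions (illustration only).
alpha = {0: Fraction(1, 2), 1: Fraction(1, 2)}
beta = {-1: Fraction(1, 3), 0: Fraction(1, 3), 2: Fraction(1, 3)}

gamma = convolve(alpha, beta)
print(variance(gamma), variance(alpha) + variance(beta))
```

Repeating the step for finitely many factors gives the partial sums of the series (18_2), which is how the Lemma uses (33) and (34).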


The Lemma may now be proved as follows: 
Suppose that $\mu_2(\sigma) < \infty$ holds for a given convergent infinite convolution
$\sigma = \sigma_1 * \sigma_2 * \cdots$. Then repeated application of (33) and (34) shows that


holds for every $k$. In fact, it is known⁶ that if $\sigma = \sigma_1 * \sigma_2 * \cdots$ is a
convergent infinite convolution, then the infinite convolution $\sigma_{k+1} * \sigma_{k+2} * \cdots$,
where $k$ is arbitrarily fixed, converges to a distribution function, and that the
convolution of this distribution function and of $\sigma_1 * \cdots * \sigma_k$ is $\sigma$.
Since (35), (17) and the assumption $\mu_2(\sigma) < \infty$ imply the convergence
of the series (18_2), the criterion quoted immediately after the Lemma shows
that the proof of the Lemma is complete.


6. Next, it will be shown that, in virtue of the representation (7) of 


5 This criterion is implied by the proof, though not the wording, of Theorems 4
and 5 in the paper of B. Jessen and A. Wintner, loc. cit.³, pp. 56-58. The method
applied there is that of the Fourier-Stieltjes transforms. For the above wording and
for a proof which does not make use of Fourier-Stieltjes transforms, cf. E. R. van
Kampen, "Infinite product measures and infinite convolutions," American Journal of
Mathematics, vol. 62 (1940), pp. 417-448; more particularly, (9) on p. 442.

6 B. Jessen and A. Wintner, loc. cit.³, Theorem 2.


the asymptotic distribution functions $\sigma_k(x)$ of the terms
$f_1(n), f_2(n), \cdots$ of an arbitrary additive $f(n)$, the two series (10_1), (10_2)
are convergent if and only if the two series (18_1), (18_2) are convergent.

It is clear from (7) and (16) that if $k$ is fixed and $p = p_k$ denotes the
$k$-th prime number, then


(36;) (ox) o— 1 = pt 


provided that the series (36,), (362) are absolutely convergent, where it is 


(362) p! 


understood that if the non-negative series (36_2) is divergent, then $\mu_2(\sigma_k) = \infty$.
Furthermore, it is clear from the Schwarz inequality,


© F (pt)? 1 
37 > = > wher => 
that the absolute convergence of the series (36_1) is implied by the convergence
of the series (36_2). Consequently, both relations (36_1), (36_2) hold for a
fixed $k$ whenever $\mu_2(\sigma_k) < \infty$. And (16), (17) show that $\mu_2(\sigma_k) < \infty$ is
equivalent to $v(\sigma_k) < \infty$ (a condition which is certainly satisfied for every $k$
whenever (18_2) is a convergent series). Suppose that $v(\sigma_k) < \infty$ for every $k$.
It is clear from (36,), (362) and (37) that 
(oe)? 5 80 that v(on) = 
(p—1)? 
by the definition (17) of ». On substituting p.(ox) from (362) into the last 
inequality, one obtains 


{= 


(p—1)* 
On the other hand, since (17) implies that v(ox) S pe2(ox), one sees from 
that 


(89) v(ox) S ApS ey’ where A, = P 
p 

The relations (38) and (39) imply that either both series (10_2), (18_2)
are convergent or both are divergent. It follows that in order to complete the
proof of the last italicized statement, it is sufficient to prove that if the series
(10_2), (18_2) are convergent, then either both series (10_1), (18_1) are
convergent or both are divergent.

To this end, notice first that, since $ab \le a^2 + b^2$,


l 
p 1=2 p i=2 


But the first double series on the right of (40) is majorized by the series (10_2),
which is supposed to be convergent; so that, since


1 


the double series on the left of (40) is convergent. It follows, therefore, from 
(36,) that the series (18,) is convergent if and only if the series 


is convergent. 

Hence, the statement that either both series (18_1), (10_1) are convergent
or both are divergent is equivalent to the statement that either both series (41),
(10_1) are convergent or both are divergent. Since the difference of the series
(41), (10_1) is

f(p) 

> 
it follows that all that remains to be shown is that the series (42) is
convergent if either (41) or (10_1) is a convergent series. But both $\{p^{-1}\}$ and
$\{(p-1)^{-1}\}$ are monotone and bounded sequences of numbers (which tend,
in fact, to 0). Hence, it is seen from a standard convergence criterion (partial
summation), that the convergence of either of the series (41), (10_1) implies
the convergence of the series (42).

This completes the proof of the last italicized statement. 


7. The proofs of (II*), § 3 and (II'), § 2 are now immediate. In fact,
it is clear from the Lemma of § 5 and the Theorem of § 4 (where $\lambda = 2 = p$),
that, because of (I'), § 1 and the italicized result of § 6, both (II*), § 3 and
(II'), § 2 are equivalent to (II), § 2.


QUEENS COLLEGE, 
THE JOHNS HOPKINS UNIVERSITY.




ON THE ALMOST PERIODICITY OF ADDITIVE NUMBER-THEORETICAL
FUNCTIONS.*


By PHILIP HARTMAN and AUREL WINTNER.


1. By an additive function $f = f(n)$ is meant a sequence $f(1), f(2), \cdots$
defined for every positive integer $n$ in such a way that

(1) $f(n_1 n_2) = f(n_1) + f(n_2)$ whenever $(n_1, n_2) = 1$   ($f(1) = 0$).


Thus, if $p = p_k$ denotes the $k$-th prime number and $l$ is a positive integer,
the correspondence $a_{kl} = f(p_k^l)$ establishes a one-to-one correspondence
between arbitrary additive functions $f$ and arbitrary double sequences of numbers
$\{\{a_{kl}\}\}$. With every additive function $f(n)$, there is associated the sequence
$f_1(n), f_2(n), \cdots$ of additive functions, where the double sequence $\{\{f_j(p_k^l)\}\}$
of $f_j(n)$ is defined as follows:

(2) $f_j(p_k^l) = f(p_k^l)$ if $k = 1, 2, \cdots, j$;   $f_j(p_k^l) = 0$ if $k > j$,   ($l = 1, 2, \cdots$).
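
The correspondence between additive functions and double sequences, and the truncation (2), can be made concrete as follows. This is a minimal sketch: the helpers `factorize`, `additive`, `truncate` and the sample choice $a(p, l) = l$ (which yields the number of prime factors counted with multiplicity) are assumptions introduced for the example, not notation from the paper.

```python
def factorize(n):
    """Return {p: l} with n equal to the product of p**l (trial division)."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def additive(a):
    """Additive function determined by its values a(p, l) = f(p**l)."""
    def f(n):
        return sum(a(p, l) for p, l in factorize(n).items())
    return f


def truncate(a, first_primes):
    """Analogue of f_j in (2): keep a(p, l) on the listed primes, 0 elsewhere."""
    allowed = set(first_primes)
    return additive(lambda p, l: a(p, l) if p in allowed else 0)


# Hypothetical choice of the double sequence (illustration only).
a = lambda p, l: l
f = additive(a)
f_2 = truncate(a, [2, 3])  # j = 2: only the primes 2 and 3 contribute

print(f(360), f_2(360))
```

Since `factorize(1)` is empty, `f(1)` is 0 automatically, in agreement with (1).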


It is known¹ that the real additive function $f(n)$ has an asymptotic
distribution function if and only if both series


are convergent, where $y^* = f^*$ is defined by placing
(4) $y^* = y$ or $y^* = 1$ according as $|y| < 1$ or $|y| \ge 1$.


It is also known² that an additive function $f(n)$ is almost periodic
$(B^2)$ if and only if both series


(51) (52) p =i 
are convergent. 

By a suitable modification of the proof of the latter theorem, it will be 
shown in the present paper that an additive function f(n) is almost periodic 


(B) if and only if the four series 


* Received April 8, 1940. 

1 P. Erdős and A. Wintner, "Additive arithmetical functions and statistical
independence," American Journal of Mathematics, vol. 61 (1939), pp. 713-721.

2 P. Erdős and A. Wintner, "Additive functions and almost periodicity (B²),"
American Journal of Mathematics, vol. 62 (1940), pp. 635-645.


are convergent, 

Notice that the exterior summation index runs in (5_2) and (6_3) from
$l = 1$ and $l = 2$, respectively.

If $f = f_I + i f_{II}$, where $f_I$, $f_{II}$ are real, then $f$ is additive and almost
periodic (B) if and only if so are both functions $f_I$, $f_{II}$. Furthermore, since
(4) implies that 


(fu)? S2 + ifn), 


the 4 series (6_1), (6_2), (6_3), (6_4) are convergent for $f = f_I + i f_{II}$ if and only
if so are the 4 + 4 series which one obtains by writing $f_I$ and $f_{II}$ for $f$. Hence,
it is sufficient to prove the italicized theorem for the case of a real-valued $f$.
This restriction will always be assumed.


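2. In order to prove first the sufficiency of the criterion of the italicized
theorem, suppose that the four series (6_1)–(6_4) are convergent.
Let $F(n)$ be the additive function for which the double sequence
$\{\{F(p_k^l)\}\}$ is given by

(7) $F(p^l) = \begin{cases} f(p^l), & \text{if } l > 1, \\ f(p), & \text{if } l = 1 \text{ and } |f(p)| \ge 1, \\ 0, & \text{if } l = 1 \text{ and } |f(p)| < 1. \end{cases}$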


Since the proof given loc. cit.2 (beginning of § 5) for 


(oe) 
I=1 p P 


on the assumption of the convergence of the series (5_2) actually uses the
convergence of (6_3) and (6_4) only, (8) is satisfied. Hence,


plm p' 


Since obviously 


m=1 l=ip>jP 
for every j, it follows from (9) that 
(10) limsup— |F(p')| 30 as joo. 
n>00 m=1 pl|m 


p>Jj 


But if $F_j(n)$ denotes the function which belongs to $F(n)$ in the same way
as the function $f_j(n)$, which is defined by (2), belongs to $f(n)$, then, since
$F$ and $F_j$ are additive functions of $n$,


N m= m=1 pim 
Hence, (10) implies that
(11) $M\{|F - F_j|\} \to 0$ as $j \to \infty$,

where $M\{g\}$ denotes the upper mean value

$M\{g\} = \limsup_{n \to \infty} \frac{1}{n} \sum_{m=1}^{n} g(m)$

of a non-negative function $g$ of $n$.
Let G(n) denote the additive function 


(12) G=f—F. 
Then (7) implies that, for every prime $p$,
(13) $G(p) = \begin{cases} 0, & \text{if } |f(p)| \ge 1, \\ f(p), & \text{if } |f(p)| < 1; \end{cases}$
so that 
p Ifml<1 P 


Since the series on the right is, in view of (4), majorized by the series (6_2),
which is supposed to be convergent, it follows that


(14) > 


On the other hand, it is clear from (13) and from the convergence of the
series (6_1) and (6_4), that


(15) is convergent, 
p 
since 
(fg 


But (14) and (15) imply, as shown loc. cit.², § 8–§ 8 bis, that

(16) $M\{|G - G_j|^2\} \to 0$ as $j \to \infty$,

where $G_j(n)$ denotes the additive function which belongs to $G(n)$ in the same
way as the function $f_j(n)$, which is defined by (2), belongs to $f(n)$.
Since (12) obviously implies that $G_j = f_j - F_j$, it is clear that

$M\{|f - f_j|\} \le M\{|F - F_j|\} + M\{|G - G_j|\}$;

it follows therefore from (11), (16) and from an application of the Schwarz
inequality to $M\{|G - G_j|\}$, that

(17) $M\{|f - f_j|\} \to 0$ as $j \to \infty$.


Finally, it is known³ that if an additive function $g(n)$ is, with reference
to a fixed prime number $p = p_j$, such that

(18) $g(n) = 0$ whenever $p_j \nmid n$,
then $g(n)$ is almost periodic (B) if and only if
00 
P 
But it is clear from the definition (2) of the additive functions $f_j$ of $n$, that
the additive function $f^{(j)}$ of $n$ which is defined for $j = 1, 2, \cdots$ by


is such that (18) is satisfied by $g = f^{(j)}$. Furthermore, it is seen from the
definitions (2), (20) of the additive function $f^{(j)}$ of $n$, that the convergence
of the series (6_3) implies that


LA 
P 


This means that (19) is satisfied by $g = f^{(j)}$ for every $j$. Consequently, $f^{(j)}$ is
almost periodic (B). Since (20) implies that

$f_j = f^{(1)} + f^{(2)} + \cdots + f^{(j)}$,

and since the functions which are almost periodic (B) form a linear space,
it follows that the additive function $f_j$ of $n$ is almost periodic (B) for every $j$.
Hence, it is clear from (17) that the proof for the almost periodicity (B)
of $f(n)$ is now complete.


< o for every p= pj. 


3. This proves that the convergence of the four series is a sufficient 
condition for the almost periodicity (B) of f. In order to prove the necessity 
of this condition, suppose that f(n) is a given real additive function which 
is almost periodic (B). 

Since f(n) then has an asymptotic distribution function, both series 


3 E. R. van Kampen and A. Wintner, "On the almost periodic behavior of
multiplicative number-theoretical functions," American Journal of Mathematics, vol. 62
(1940), pp. 613-626, Theorem II, $\lambda = 1$.


(3_1), (3_2) are convergent. And, in view of (4), the convergence of (3_2)
implies that


(21) 
P 
In terms of the given $f(n)$, define an additive function $D(n)$ by placing
(22) $D = f - H$,

where $H = H(n)$ denotes that additive function for which the double sequence
$\{\{H(p_k^l)\}\}$ is given by

(23) $H(p^l) = \begin{cases} f(p^l), & \text{if } l > 1, \\ f(p), & \text{if } l = 1 \text{ and } |f(p)| \ge 1, \\ 0, & \text{if } l = 1 \text{ and } |f(p)| < 1. \end{cases}$

Accordingly,

$D(p^l) = \begin{cases} 0, & \text{if } l > 1, \\ 0, & \text{if } l = 1 \text{ and } |f(p)| \ge 1, \\ f(p), & \text{if } l = 1 \text{ and } |f(p)| < 1, \end{cases}$

and so it is clear from (21) and from the convergence of the series (3_1) and
(3_2), that one obtains four convergent series by writing $D$ for $f$ in (6_1)–(6_4).
It follows, therefore, from the result proved in § 2, that $D(n)$ is almost
periodic (B). Since $f$ is almost periodic (B) by assumption, one sees from
(22) that $H$ is almost periodic (B). In particular,


(24) |} < 


But (21) and (24) imply, after an obvious adaptation of the estimates carried
out loc. cit.², § 11, that


| 
i=l p 


this series (25) being the analogue of the last series loc. cit.², § 11, in case
the almost periodic class $(B^2)$ is replaced by (B).

It is now easy to prove the convergence of the four series (6_1)–(6_4).
In fact, it is clear from (23) that (25) may be written in the form


| f(p)!21 l=2 p p! 


Furthermore, both series (3_1), (3_2) are convergent in view of the existence
of the asymptotic distribution function of the function $f$. Since (3_2) is
identical with (6_2), while (26) implies the convergence of (6_3) and (6_4),
the convergence of the three series (6_2), (6_3), (6_4) follows. Finally, since
(21) and (26) imply the absolute convergence of the series




an 
| f(p)|21 P lf(p)|21 


respectively, the convergence of (6_1) follows from the convergence of (3_1)
and from the fact that, in view of (4), the series (6_1) may formally be
written as the sum of the three series (3_1), (27_1), (27_2).


4. A careful perusal of the proofs, applied loc. cit.² and above, shows
that, by standard applications of the inequalities of Hölder and Minkowski,
one can generalize the criteria (5_1)–(5_2) and (6_1)–(6_4) of the respective
almost periodic classes $(B^2)$ and (B) = $(B^1)$ as follows: An additive
function $f(n)$ is almost periodic $(B^\lambda)$ for a fixed $\lambda \ge 1$ if and only if the four
series


are convergent. (Correspondingly, the convergence of (28_2), (28_3), (28_4)
is equivalent to the convergence of the single series (5_2), if $\lambda = 2$.)

A consequence of the italicized theorem is that if the double sequence
$\{\{f(p_k^l)\}\}$ of an additive function $f(n)$ is bounded, then $f(n)$ either is almost
periodic $(B^\lambda)$ for arbitrarily large $\lambda$ or is not even almost periodic (B) = $(B^1)$.
This is not obvious in itself, since $f(n)$ is not in general a bounded function
when its double sequence is bounded.


QUEENS COLLEGE, 
THE JOHNS HOPKINS UNIVERSITY. 




ON THE SPHERICAL APPROACH TO THE NORMAL 
DISTRIBUTION LAW.* 


By PHILIP HARTMAN and AUREL WINTNER.


Introduction. There are two classical "geometrical" approaches to the
normal distribution law. One of these is represented by the theory of the
addition of independent random variables or, equivalently, by the theory of
convolutions. This approach, followed in a general and precise manner by
P. Lévy and his followers, is based on the consideration of a product
distribution on an n-dimensional cube, a distribution which is then projected
orthogonally on the principal diagonal of this cube.¹ The other approach, which is
due to Boltzmann and is reproduced in some of Borel's elementary text-books
on the calculus of probability, has as its starting point, not the theory of
independent random variables, but rather the simplest model of the Maxwell
theory of velocity distributions.² This approach, which plays a fundamental
role in the investigations of P. Lévy³ and N. Wiener⁴ in functional analysis,
is based on the consideration of the equidistribution on the surface of an
n-dimensional sphere, a distribution which is then projected orthogonally on a
diameter of this sphere.

If the unit of length is increased in the proportion $1 : \sqrt{n}$ in case of the
first approach, and decreased in the same proportion in the second approach,
there results, as $n \to \infty$, a symmetric normal distribution in both cases.

It is known⁵ that the Fourier-Stieltjes transform of the equidistribution
on the surface of an n-dimensional sphere of radius $r$ is the Bessel function
$J^*_{n/2-1}(r|w|)/J^*_{n/2-1}(0)$, where $J^*_\nu(z) = z^{-\nu} J_\nu(z)$, and that this Bessel
function is also the Fourier-Stieltjes transform of the 1-dimensional distribution
which represents the projection on a diameter. In fact,⁶ any 1-dimensional


* Received April 15, 1940.

1 Cf., e.g., Lévy [11]. As to the geometrical interpretation of the convolution
process by means of orthogonal projections, cf. Sommerfeld [17], where the simplest
case of the "Abrundungsfehler" is considered.

2 Cf. Boltzmann [2], vol. 2, pp. 96-100 and, e.g., Borel [3], pp. 44-50; also Borel
[4], pp. 90-93; and Borel and Deltheil [5], pp. 134-136.

3 Lévy [12]; also Lévy [13].

4 Cf., e.g., Wiener [18], pp. 135-143.

5 Cf., e.g., Wintner [20], p. 313, where references are given to the principle of
Huyghens; Jessen and Wintner [9], p. 59; Wintner [21]. Some of these things were
recently rediscovered by Schoenberg (e.g., Schoenberg [15], Lemma 4); cf.
Blumenthal [1].

6 Cf., e.g., Jessen and Wintner [9], p. 55; Wintner [21], p. 76.



distribution which represents the projection of an n-dimensional distribution 
function of radial symmetry (i.e., a distribution which is built up by means 
of an arbitrary Stieltjes weight factor from spherical equidistributions be- 
longing to varying r) has the same Fourier-Stieltjes transform as the projected 
distribution. 

In view of the multi-dimensional⁷ analogue of Lévy's inversion formula
of Fourier-Stieltjes transforms, this spherically stratified decomposition of an
arbitrary distribution function of radial symmetry into spherical
equidistributions is known to be equivalent to the Cauchy-Poisson formula for
spherical waves in n dimensions.⁸

The results of the present paper concern certain questions connected
partly with the above topics and partly with a problem⁹ suggested by the
most primitive approach to Maxwell's law of velocity distribution. If
$\delta(v_x, v_y, v_z)$ denotes the density of probability at the point $v = (v_x, v_y, v_z)$
of the velocity space, then Maxwell's assumptions imply that $\delta$ is a function
of $|v| = (v_x^2 + v_y^2 + v_z^2)^{1/2}$ alone, that the probability densities of each of the
velocity components $v_x$, $v_y$, $v_z$ depend on the respective components alone, and
that the latter densities are represented, up to adjusting factors of
proportionality, by the same function $\delta$ as the probability density of the speed $|v|$.
In fact, this condition of the preservation of the density function under
projections is obviously satisfied if $\log \delta(|v|)$ is proportional to $|v|^2$. There
arises, therefore, the question whether or not this property of preservation of
the probability density under projection is in itself sufficient to assure that
$\delta(|v|)$ defines the Maxwell distribution. The result of § 5 will imply that
the Maxwell law may be deduced from this functional condition alone.

§ 3 deals with the class of distribution functions which may be represented 
as stratifications of a given sheaf of distribution functions. The results obtained 
in this section are illustrated in § 4 by their application to the special case of 
stable distribution functions. The simplest and least restricted case of this 
particular case is the one where the underlying stable distribution is normal. 
This limiting case will be separately studied in § 2 by an elementary approach. 

As to this approach, which in § 3 will be extended to the general case,
a few methodical remarks seem to be of interest. Recently, Schoenberg¹⁰ has
rediscovered the above-mentioned Cauchy-Poisson decomposition and, in
particular, the fact¹¹ that the Fourier-Stieltjes transform of the spherical
equidistribution of radius $r$ is $J^*_{n/2-1}(r|w|)/J^*_{n/2-1}(0)$. Actually, Schoenberg's
considerations¹² concern all functions which are Fourier-Stieltjes transforms
of radially symmetric n-dimensional distributions for every $n$. Since Borel's
approach to the normal distribution law also has escaped Schoenberg, he
rediscovers¹⁰ the spherical approach to the Gaussian law by applying to the Bessel
functions, mentioned above, the continuity theorem of Fourier-Stieltjes
transforms, instead of proceeding directly as Boltzmann, or Borel and Lévy do.¹³
And he applies the same method to the spherical stratification formula of
Cauchy-Poisson. Now, it will be seen in § 2 that the intuitive and elementary
method may be transferred without much effort to this general case of an
arbitrary Stieltjes factor of spherical stratification. In particular, the method
of Fourier-Stieltjes transforms, which is so fundamental in most problems of
mathematical statistics, turns out to be a ballast in the present case.


7 Haviland [8], I.
8 Cf., e.g., Wintner [20], pp. 316-319; Jessen and Wintner [9], p. 55; Wintner [21],
p. 76.
9 For more refined approaches, cf., e.g., Boltzmann [2], vol. 1, chap. 1.
10 Schoenberg [15], p. 816; cf. Blumenthal [1].
11 Schoenberg [14], p. 791; cf. Blumenthal [1].


1. Let $\phi_n(E_n)$ denote a distribution function on the n-dimensional
Euclidean space $R_n$, that is, $\phi_n$ is a completely additive, non-negative set
function defined for all Borel sets $E_n$ of $R_n$ in such a way that $\phi_n(R_n) = 1$.
It will be supposed that $\phi_n$ is radially symmetric, i.e., $\phi_n(E_n') = \phi_n(E_n)$ if
$E_n'$ is the image of any Borel set $E_n$ under an arbitrary rotation of the space
$R_n$ about the origin. For $k = 1, 2, \cdots, n - 1$, a k-dimensional radially
symmetric distribution function $\phi_k(E_k)$ can be associated with $\phi_n(E_n)$ in the
following manner:

Let $R_k$ be a k-dimensional hyperplane through the origin of $R_n$, and $E_k$ a
Borel set on $R_k$, finally $P_n(E_k)$ the set of those points in $R_n$ whose orthogonal
projection on $R_k$ is in $E_k$. Then a distribution function $\phi_k$ is defined by the
relation¹⁴

(1) $\phi_k(E_k) = \phi_n(P_n(E_k))$.

$\phi_k$ is called the k-dimensional projection of $\phi_n$; in virtue of the radial symmetry
of $\phi_n$, it is independent of the choice of the hyperplane $R_k$. It is clear from
this definition that $\phi_k$ is also the k-dimensional projection of $\phi_j$, for
$k < j < n$.

The set functions $\phi_1, \cdots, \phi_n$ of radial symmetry may be replaced by the
non-decreasing point functions $\rho_1(r), \cdots, \rho_n(r)$ which are defined for
$0 \le r < \infty$ as follows:

(2_k) $\rho_k(r) = \phi_k(E_k^r)$ if $r > 0$;   $\rho_k(0) = 0$,


12 Schoenberg [15], pp. 816-821.

13 Incidentally, Schoenberg's [15] central formula (2.4), which is due to Laplace,
may be found on p. 421 of Watson's Treatise on Bessel Functions.

14 Cf. Jessen and Wintner [9], p. 55; Wintner [21], p. 76.


where $E_k^r$ denotes the k-dimensional sphere of radius $r$ about the origin of $R_k$.
In the case $k = 1$, the function $\rho_1$ is usually replaced by the symmetric
distribution $\sigma(x)$, $-\infty < x < +\infty$,

(2 bis) $\sigma(x) = \phi_1(E_1(x))$,

where $E_1(x)$ denotes the half-line $(-\infty, x)$ on $R_1$. Obviously, these
distribution functions $\rho_k$, $\sigma$ satisfy the boundary conditions

(3_k) $\rho_k(0) = 0$, $\rho_k(+\infty) = 1$, ($\rho_k(r) = 0$ for $-\infty < r < 0$),
(3 bis) $\sigma(-\infty) = 0$, $\sigma(+\infty) = 1$;

also, in virtue of the symmetry of $\phi_n$,

(4) $\sigma(-x) = 1 - \sigma(x)$.

In the sequel, the functions $\rho_k$ (where $k < n$) and $\sigma$ also will be referred to as
projections of $\phi_n$.
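It is clear that the relation between $\rho_n$ and $\rho_k$ is given by the formula

(5_nk) $\rho_k(r) = \rho_n(r) + B_n^k \int_r^\infty \Big[ \int_{\arccos (r/t)}^{\pi/2} (\cos\theta)^{k-1} (\sin\theta)^{n-k-1} \, d\theta \Big] d\rho_n(t)$,   $r > 0$,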
where 
and 
(7) Aj= ~ (2x) 472, 
0 


since the integral $B_n^k \int (\cos\theta)^{k-1} (\sin\theta)^{n-k-1} \, d\theta$ in (5_nk) is that portion of the
$(n - 1)$-dimensional area of the boundary of the sphere of radius $t$ ($> r$),
whose projection on the hyperplane $R_k$ is on the sphere $E_k^r$. The relation
between $\sigma$ and $\rho_n$ is given by


(8) o(z) where x>0; ef. (4). 


The formula (5_nk) may be rewritten by introducing the distribution
function

(9) $\psi_{nk}(r) = B_n^k \int_{\arccos r}^{\pi/2} (\cos\theta)^{k-1} (\sin\theta)^{n-k-1} \, d\theta$, if $0 \le r \le 1$;
    $\psi_{nk}(r) = 1$, if $r > 1$,

(a function which obviously bears the same relationship to the k-dimensional
projection of the n-dimensional spherical equidistribution of radius 1, as the
function (2_k) does to $\phi_k$). The formula (5_nk) then becomes

(10) $\rho_k(r) = \int_0^\infty \psi_{nk}(r/t) \, d\rho_n(t)$.


15 Cf., e.g., Borel and Deltheil [5], p. 135 and p. 187.



In view of (2_k), the formula (10) represents, in terms of the arbitrary
Stieltjes weight factor $\rho_n(t)$, the stratified decomposition of an arbitrary
n-dimensional distribution function of radial symmetry into equidistributions
on surfaces of spheres of varying radii.

While it is obvious that the distribution functions $\rho_k$ and $\phi_k$ determine
each other uniquely, and that $\rho_k$ is uniquely determined by $\rho_n$, it is not so
obvious that $\rho_n$ is uniquely determined by $\rho_k$. That such is the case,
nevertheless, may be seen by considering formula (10) as a convolution on a
logarithmic scale; the uniqueness of $\rho_n$ then follows from the uniqueness
theorem of Fourier-Stieltjes transforms. This remark, depending on
Fourier-Stieltjes transforms, is not used in the sequel.

For the sake of brevity, a 1-dimensional distribution function which is
the projection of an n-dimensional radially symmetric distribution function
for arbitrarily large $n$, will be called a distribution function of class $\Omega$.


2. On the basis of the elementary geometrical relations collected above,
it is easy to prove that a distribution function $\sigma(x)$, $-\infty < x < +\infty$, is of
class $\Omega$ if and only if there exists a distribution function $\tau(t)$, $-\infty < t < +\infty$,
such that

(11) $\tau(0) = 0$, $\tau(+\infty) = 1$
and
(12) $\sigma(x) = \int_0^\infty \sigma^*(x/t) \, d\tau(t)$,

where $\sigma^*(x)$ is the symmetric normal distribution function, of unit standard
deviation, i.e.,

(13) $\sigma^*(x) = (2\pi)^{-1/2} \int_{-\infty}^{x} e^{-y^2/2} \, dy$.

In order to prove this, suppose first that $\sigma(x)$ is a distribution function
of class $\Omega$. Then there exists a $\phi_n(E_n)$ and corresponding functions (2_n),
(2 bis), such that (5_n1) and (8) hold for $n = 1, 2, \cdots$. If $x > 0$ is fixed, these
relations may be rewritten in the form

co 
(14) = $+ + Ann f (1 — y?/n)*"*) dy dpn (nit), 
wig 0, 

if one changes the integration variables from $\theta$ to $y = n^{1/2} \cos\theta$ and from $t$
to $n^{-1/2} t$.

According to the selection theorems of Helly,¹⁶ there exists a
non-decreasing function $\tau(t)$ and an increasing sequence $\{m_n\}$ of positive integers,
such that, as $n \to \infty$,

(15) $\rho_{m_n}(m_n^{1/2} t) \to \tau(t)$,

where the sign $\to$ is meant in the sense of the theory of monotone functions
(i.e., in the sense that one has convergence at every continuity point $t$ of
the limit function $\tau$). It is clear from (15) that

(16) $\rho_{m_n}(x) \to \tau(+0)$, for all $x > 0$, as $n \to \infty$.
In view of the term-by-term integration theorem of Helly,¹⁶ it is also


clear from (15) that 


€ € 
if $x > 0$ is fixed and $\epsilon$ is an arbitrary positive number such that $t = \epsilon$ is a
continuity point of $\tau(t)$. On the other hand, since
(18) $A_n n^{-1/2} \to (2\pi)^{-1/2}$, $n \to \infty$,
holds in virtue of (7), and since

$(1 - y^2/n)^{(n-3)/2} \to e^{-y^2/2}$, $n \to \infty$,

holds uniformly for $|y| \le$ const., where const. is arbitrary but fixed, one sees,
by choosing const. $= x/\epsilon$, that


a/t 
Ayn-* (1 — y?/n)?™*) dy (x/t) — 


0 


holds uniformly for $\epsilon \le t < \infty$. It follows that
a/t 
(19) Am,Mn74 f (1 — y?/my) dy dpm, (mr2t) 
€ 70 
> [o*(a/t) — 
€ 
holds for every $x > 0$ and every $\epsilon > 0$ (such that $t = \epsilon$ is a continuity
point of $\tau$).
Furthermore, if $x > 0$ is fixed and $t = n^{-1/2} x$, then, by (7),


x/t ne 
Aan | (1 — y?/n)2("-9) dy = Ann4 f, (1 — y?/n) dy = 4; 
0 0 
so that 


16 Cf., e.g., Wintner [22]. 




€ 
(1 dpn < — pn (2) 


Hence, 


€ a/t 
(20) lim sup f J (1 y?/ Mn) dy dpn(my3t) 


° < 4[r(c) 0)]. 
Finally, from (13), 


(21) f [o* (a/t) —}]dr(t) =2f = 
+0 +0 
The relations (14), (16), (19), (20), (21) obviously imply 


(22) $\sigma(x) = \tfrac12 + \tfrac12\tau(+0) + \int_{+0}^{\infty} [\sigma^*(x/t) - \tfrac12]\,d\tau(t)$, if $x > 0$;


while (11) is a consequence of (3 bis), (15) and (22). But (22) is equivalent 
to (12) in virtue of (11) and (4). This proves the second half of the state- 
ment italicized at the beginning of this section. 

In order to prove the converse, notice first that if the distribution function $\tau(t)$ in (12) is the function $\tau^*(t)$ defined by

(23) $\tau^*(t) = \tfrac12 + \tfrac12\,\mathrm{sgn}(t - 1)$,

the corresponding distribution function $\sigma$ in (12) is precisely $\sigma^*$. It is well known that $\sigma^*$ is a distribution function of class $\Omega$; in fact, the function $\sigma = \sigma^*$ is known to belong, in virtue of (5$_n$) and (8), to the function
pr p*n 

n n : 
(24) p*,.(r) = II (27) where <1 

k=1 k=1 
(Gauss, Bravais, Maxwell; also Schoenberg¹⁷). Hence, it is seen¹⁸ that the function (12) is the 1-dimensional projection of the n-dimensional radially symmetric distribution function $\phi_n$ belonging to


17 Schoenberg [15], p. 817 (top). 
18 This statement is an obvious consequence of (10) and the fact that if $\tau_1$, $\tau_2$, $\tau_3$ are three distribution functions such that $\tau_1(0) = \tau_2(0) = \tau_3(0) = 0$, then


00 00 
f T,(a/t)dr,(t) = T;(a/t)dr,(t) 
0 


oo 00 co 
0 0 0 e 


The first of these relations is merely an integration by parts; the second clearly is true if $\tau_3$ is a step-function, so that the relation holds in general, in view of the definition of Stieltjes integrals.




oO 
(25) pn(r) 
in virtue of (2,). This completes the proof of the italicized statement. 


3. In the sequel, it will be necessary to make use of the fact that if $\sigma$ is a distribution function of class $\Omega$, the function $\tau$ occurring in (12) is unique. This is easily proved by using Fourier-Stieltjes transforms; cf. the remark in § 1 concerning $\rho_k$ and $\rho_n$. Incidentally, the uniqueness of $\tau$, when combined with a standard application of the theorem of Helly, obviously implies that (15) is valid without applying any selection, i.e., by placing $m_n = n$.

The problem of replacing the sheaf of normal distribution functions $\sigma^*(x/t)$ in the representation (12) of a distribution function of class $\Omega$ by a sheaf of arbitrary distribution functions $\omega(x, y)$ of class $\Omega$ will now be considered. Let $\tau(t, y)$ denote a function which is defined for $0 \le t < \infty$, $0 \le y < \infty$ in such a way that, for every fixed $t \ge 0$, $\tau(t, y)$ is a Baire function of $y$, $0 \le y < \infty$; and that it is, for every fixed $y \ge 0$, a distribution function, i.e., a non-decreasing function satisfying the boundary conditions $\tau(0, y) = 0$, $\tau(+\infty, y) = 1$. Let $\omega(x, y)$ denote, for a fixed $y$, the distribution function of class $\Omega$ corresponding to $\tau(t, y)$ in virtue of (12), so that

(26) $\omega(x, y) = \int_0^\infty \sigma^*(x/t)\,d_t\tau(t, y)$.


It is clear that $\omega(x, y)$ is, for a fixed $x$, a Baire function of $y$ ($\ge 0$). As above, it can easily be shown that if $\xi(t)$, $-\infty < t < +\infty$, is a distribution function satisfying $\xi(0) = 0$, then the distribution function

(27) $\sigma(x) = \int_0^\infty \omega(x, t)\,d\xi(t)$

is a distribution function of class $\Omega$. On the other hand, it will be proved that if $\sigma(x)$ is a distribution function of class $\Omega$ associated with the function $\tau(t)$ in virtue of (12), then $\sigma(x)$ has a representation of the form (27) if and only if there exists a distribution function $\xi(t)$ such that $\xi(0) = 0$, $\xi(+\infty) = 1$ and

(28) $\tau(t) = \int_0^\infty \tau(t, y)\,d\xi(y)$.


Suppose first that $\sigma(x)$ has a representation of the form (27). Define a sequence of distribution functions $\xi^m(t)$, which tend to $\xi(t)$ as $m \to \infty$ and which are of the form

$\xi^m(t) = \sum_{i=1}^{m} a_{im}\tau^*(t/h_{im})$, $(a_{im} \ge 0,\ h_{im} > 0,\ \sum_{i=1}^{m} a_{im} = 1)$,




where r* is defined in (23); so that by (27) and the definition of Stieltjes’ 
integrals, 


m 
(29) (2) —lim w(a, t)de™(t) = lim ajmo(2, him) 
m—>OO e/ 0 i=1 


0 


i=1 


— for 


Hence, (28) follows from (12) in virtue of the uniqueness of the distribution function $\tau$ in (12). The converse of the italicized statement also follows from (29), since the preceding steps are obviously reversible.

Suppose that the sheaf of distribution functions $\omega(x, y)$ has the property that, for some fixed $L > 0$, the function $\sigma^*(x/L)$ may be represented in the form (27); so that there exists a distribution function $\xi_L(y)$, such that $\xi_L(0) = 0$, $\xi_L(+\infty) = 1$, and

(30) $\sigma^*(x/L) = \int_0^\infty \omega(x, t)\,d\xi_L(t)$.

Then, by the italicized statement just proved, the function $\tau = \tau^*(t/L)$, which corresponds to (30) in virtue of (12), satisfies

(31) $\tau^*(t/L) = \int_0^\infty \tau(t, y)\,d\xi_L(y)$.

Let $y = T^L$ denote an arbitrary point in the spectrum¹⁹ of $\xi_L(y)$, and let $t < L$, $\epsilon > 0$. Then, by (31), (23),
TL+e 


= + —é&,(T4— fininf r(t,y). 
TL-eSyST Lie 
Hence, 
lim inf r(t,y) =O if 
yoTL 
It follows that there exist a distribution function $\tau_1(t)$, $-\infty < t < +\infty$, and a sequence of positive numbers $T_n$ such that $T_n \to T^L$ and such that $\tau(t, T_n) \to \tau_1(t)$, as $n \to \infty$; finally, $\tau_1(t) = 0$ if $t < L$. Thus, by the term-by-term integration theorem of Helly,


19 A point is said to belong to the spectrum of a function if the function is not constant in any interval containing this point in its interior.


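$\lim_{n\to\infty} \omega(x, T_n) = \int_{L-0}^{\infty} \sigma^*(x/t)\,d\tau_1(t)$, $(T_n \to T^L)$.

It is similarly shown that there exist a sequence of positive numbers $T'_n$, such that $T'_n \to T^L$, and a distribution function $\tau_2(t)$ such that $\tau_2(t) = 1$, if $t > L$, and $\tau(t, T'_n) \to \tau_2(t)$ as $n \to \infty$; so that

$\lim_{n\to\infty} \omega(x, T'_n) = \int_0^{L+0} \sigma^*(x/t)\,d\tau_2(t)$, $(T'_n \to T^L)$.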

It follows that if the distribution function $\omega(x, y)$ tends to $\omega(x, y_0)$ as $y \to y_0$ (in the sense of monotone functions) for every $y_0$, $0 \le y_0 < \infty$, then every distribution function of class $\Omega$ may be represented in the form (27) if and only if there exists for every $L > 0$ at least one $T = T^L$ such that $\omega(x, T^L) = \sigma^*(x/L)$.
Suppose, in particular, that the sheaf of distribution functions $\omega(x, t)$ is of the form $\omega(x, t) = \omega(x/t)$, where $\omega(x)$ is an arbitrary distribution function of class $\Omega$; so that, by § 2,

$\omega(x) = \int_0^\infty \sigma^*(x/t)\,d\tau_\omega(t)$

holds for a suitable distribution function $\tau = \tau_\omega$, satisfying (11). A distribution function $\sigma$ which may be represented by means of $\omega$ in the form

$\sigma(x) = \int_0^\infty \omega(x/t)\,d\xi(t)$, $x \ge 0$,

where $\xi(t)$ is a distribution function satisfying $\xi(0) = 0$, will be said to be of class $\Omega(\omega)$. In this particular case, the preceding results are seen to be to the effect that a distribution function (12) is of class $\Omega(\omega)$ if and only if there exists a distribution function $\xi(t)$ which vanishes at $t = 0$ and satisfies

(32) $\tau(t) = \int_0^\infty \tau_\omega(t/y)\,d\xi(y)$;

furthermore, every distribution function of class $\Omega$ is of class $\Omega(\omega)$ if and only if there exists a positive number $T^\omega$ such that $\omega(x/T^\omega) = \sigma^*(x)$, $-\infty < x < +\infty$. (The italicized statement of § 2 is the particular case $T^\omega = 1$.)

If, in addition, use is made of the Stieltjes-Fubini relation which is the second formula of footnote 18, one sees that if the distribution function $\sigma(x)$ is of class $\Omega(\omega)$ and if the distribution function $\rho(x)$ is of class $\Omega(\sigma)$, then $\rho(x)$ is of class $\Omega(\omega)$.


4. As an application of these statements, consider the symmetric stable distribution functions; that is, the distribution functions whose Fourier-Stieltjes transforms are $\exp(-|u|^\gamma)$, $0 < \gamma \le 2$ (the distribution function whose Fourier-Stieltjes transform is identically 1 also is symmetric and stable, but will be excluded as trivial). It is known²⁰ that these distribution functions are of class $\Omega$. Let $\sigma_\gamma(x)$ be the distribution function whose Fourier-Stieltjes transform is $\exp(-|u|^\gamma)$, $0 < \gamma \le 2$. Thus, there exists a distribution function $\tau^\gamma(t)$ such that $\tau^\gamma(0) = 0$ and

(33) $\sigma_\gamma(x) = \int_0^\infty \sigma^*(x/t)\,d\tau^\gamma(t)$, where $\sigma^* = \sigma_2$,

if $\sigma^*$ denotes the same distribution function as in (13), except that the unit of $x$ is different,

(13 bis) $\sigma^*(x) = \tfrac12\pi^{-1/2}\int_{-\infty}^{x} \exp(-y^2/4)\,dy$.

The relation (33) implies that

(34) $\exp(-|u|^\gamma) = \int_0^\infty \exp(-|ut|^2)\,d\tau^\gamma(t)$, $-\infty < u < +\infty$.


This merely states that the Fourier-Stieltjes transform of the function on the 
left of (33) is the same as the Fourier-Stieltjes transform of the function on 
the right. 
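The relation (34) can be checked numerically in the case $\gamma = 1$ (the Cauchy law). The sketch below relies on an assumption which is not taken from the text: that, for $\gamma = 1$, the variable $s = t^2$ has under $\tau^1$ the density $\tfrac12\pi^{-1/2}s^{-3/2}\exp(-1/4s)$, a classical law whose Laplace transform is $\exp(-\lambda^{1/2})$. The function names are hypothetical.

```python
import math

def levy_weight(s: float) -> float:
    """Assumed density of s = t^2 under tau^1: s^(-3/2) exp(-1/(4s)) / (2 sqrt(pi))."""
    return s ** -1.5 * math.exp(-1.0 / (4.0 * s)) / (2.0 * math.sqrt(math.pi))

def mixture_transform(u: float, lo: float = -12.0, hi: float = 12.0, steps: int = 4800) -> float:
    """Trapezoidal value of the right side of (34) for gamma = 1, after substituting s = exp(v).

    Intended for u != 0; for u = 0 the truncated range loses a visible part of the mass.
    """
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        s = math.exp(lo + i * h)
        term = math.exp(-u * u * s) * levy_weight(s) * s  # ds = s dv
        total += term if 0 < i < steps else 0.5 * term
    return total * h

def cauchy_transform(u: float) -> float:
    """Left side of (34) for gamma = 1."""
    return math.exp(-abs(u))
```

For moderate values of $u \ne 0$ the two sides should agree to many decimal places, since the integrand decays very rapidly at both ends of the range after the substitution.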

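On replacing $|u|$ by $|u|^{\beta/\gamma}$, $0 < \beta \le \gamma \le 2$, and changing the integration variable from $t$ to $t^{\beta/\gamma}$, one can write (34) in the form

(35) $\exp(-|u|^\beta) = \int_0^\infty \exp(-|ut|^{2\beta/\gamma})\,d\tau^\gamma(t^{\beta/\gamma})$, $-\infty < u < +\infty$,

or

(36) $\sigma_\beta(x) = \int_0^\infty \sigma_{2\beta/\gamma}(x/t)\,d\tau^\gamma(t^{\beta/\gamma})$.

Since $\beta$, $\gamma$ are arbitrary ($0 < \beta \le \gamma \le 2$), this relation is equivalent to the first half of the statement: $\sigma_\alpha(x)$ is of class $\Omega(\sigma_\beta)$ if and only if $\alpha \le \beta$.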

To prove the second half of this statement, suppose that $\sigma_\alpha$ is of class $\Omega(\sigma_\beta)$, $\beta < \alpha \le 2$; so that there exists a distribution function $\tau_{\alpha\beta}(t)$ which vanishes for $t \le 0$ and satisfies


20 Wintner [21]. This result was rediscovered by Schoenberg [16], pp. 532-533 (cf. Blumenthal [1]), who used methods equivalent (cf. Haviland [8], II, p. 382) to those applied loc. cit. [20], where the proof, in fact, was based on the multidimensional analogue of Lévy's continuity theorem (cf. Haviland [8], II). Incidentally, cf. Wiener and Wintner [19], pp. 241-242.

It may be mentioned in this connection that Theorem 3 of Schoenberg [16] is merely a corollary of the classical representation of the infinitely divisible laws which is due to P. Lévy (who, in fact, does not assume the symmetry of the distributions).




(37) $\sigma_\alpha(x) = \int_0^\infty \sigma_\beta(x/t)\,d\tau_{\alpha\beta}(t)$.

Then, by the same reasoning which deduced (36) from (33),


(38) = (2) — 


This would imply that there exists a positive number T = T(o2g/2) such that 
o2p/a(x/T )=o2(x), or exp(— | uT 
This contradiction establishes the theorem. 

The theorem just proved and the last italicized theorem of § 3 imply that if $\alpha < \beta$, the class $\Omega(\sigma_\alpha)$ is a proper subset of the class $\Omega(\sigma_\beta)$.


5. The standard methods described at the beginning of the Introduction represent asymptotic approaches to the normal distributions. Another approach to these distributions is connected with the well-known fact that the stable distribution $\sigma_\gamma$ has a finite standard deviation only in the normal case $\gamma = 2$. In what follows, there will be considered still another approach to the normal distributions of radial symmetry. This approach might be of interest in view of Maxwell's deduction of his distribution law of velocities.

Let $\phi_n$ be an arbitrary radially symmetric distribution function on the Euclidean space $R_n$. Let $\phi_k$ be the k-dimensional projection of $\phi_n$; finally, $\rho_n$, $\rho_k$ the corresponding functions (2$_n$), (2$_k$). Since the function (9) is absolutely continuous, it follows from (10) that if $k < n$, then $\rho_k$ is absolutely continuous on the open half-line $(0, +\infty)$; so that

(39) $\rho_k(x) = \rho_k(+0) + \int_0^x (d\rho_k(r)/dr)\,dr$, $x > 0$; $(k = 1, \cdots, n-1)$.

Hence, the k-dimensional distribution function $\phi_k$, where $k = 1, \cdots, n-1$, may be decomposed into a linear combination of two k-dimensional distribution functions $\phi_k'$, $\phi_k''$ of radial symmetry,

(40) $\phi_k = A\phi_k' + (1 - A)\phi_k''$, $0 \le A \le 1$,

where $\phi_k'(E_k^0) = 1$ if $E_k^0$ denotes the Borel set consisting of the single point which is the origin of $R_k$, and $\phi_k''$ is absolutely continuous; so that there exists a non-negative function

& = 8, (2, ° * = |4) 


of the position (2,,- -,2«) in Ry for which 
(41) (Ex) +- |2) da, - + dry. 


It follows from (2), (39) and (40) that A= px(+ 0) =pn(+ 0) and 




(42) (7) (for almost all r > 0), 


$2\pi^{k/2}/\Gamma(k/2)$ being the Euclidean measure of the boundary of the k-sphere of radius 1.
Define, for 0 <r < o, a non-decreasing function, %(r), k=1,---,n, 


by placing 


(43) =1— dn(2), 0. 
Thus, for ==1,---,n, 

(2ar) 
also, for k =1,:--,n—1 
(45) (7) (for almost all r > 0) 


The relation (45) is meaningless for $k = n$, unless the arbitrary function $\phi_n$ has a decomposition similar to (40), implying the existence of a $\delta_n$.

It will be shown that if $\phi_n$ is an n-dimensional radially symmetric distribution function such that for some fixed $k$ ($0 < k < n$), and for some pair of positive constants $c$, $C$, one has
(46) vn(cr) = C*vnx (1), r> 0, 
then $\phi_n$ is a radially symmetric normal distribution except for a possible jump at the origin. This means that $\phi_n$ may be written in terms of a non-negative constant $A \le 1$ in the form
(47) $\phi_n = A\phi_n' + (1 - A)\phi_n''$, $0 \le A \le 1$,
where $\phi_n'(E_n^0) = 1$, and $\phi_n''$ is a radially symmetric n-dimensional normal distribution.

In order to prove this theorem, note that, in view of (43) and (46), the 
absolute continuity of pn for r > 0 implies the absolute continuity of v1, vn, 
$\rho_n$ for $r > 0$. Thus, under these conditions, equations similar to (40), (41), and (45) hold for $k = n$. Since $\phi_{n-k}''$ is the projection of $\phi_n''$, it is clear that


+00 +00 

-00 

so that 


if denotes the common value of 8,(r) and (r/c) ; ef. (45) and 
(46). Since & is a density, repeated integration of (49) shows that 


+00 4 
-00 -00 




for every positive integer m. It follows, therefore, from (48) and (49) that, 
up to a factor of proportionality depending on m, the functions C*8(r), cé(cr) 
are the densities of an m-dimensional radially symmetric distribution function 
and its $(m - k)$-dimensional projection, respectively (unless $\delta(r) = 0$ for every $r > 0$; but this trivial case may be discarded, for in this case the statement (49) is satisfied by $A = 1$). Thus, one can introduce a 1-dimensional distribution function by placing


+00 

-00 
Clearly, this $\sigma(x)$ is, for $m = 1, 2, \cdots$, the projection of an $(mk + 1)$-dimensional radially symmetric distribution whose density is proportional to $\delta(r)$. Consequently, by (12),


where $\tau(t)$ is a distribution function satisfying (11). Since (50), (51) and (13) imply that


it follows from (49) that 


0 


On integrating this relation between r= — o and r = 2, one sees from (50) 
and (13) that 


o(cx) o* (2/t)t*dr(t). 
0 
It follows, therefore, by comparison with (51) that 


(53) r(cy) Hdr(t), OSy<o. 


0 
But it is clear that (53) cannot hold unless $c = 1$; in which case
= 7*( (27) C2), 


where $\tau^*$ is defined as in (23). Hence, from (52),
-00 
This completes the proof of the last italicized statement. 


6. It is clear from the proof and from the Helly theory of monotone functions that the theorem just proved may be generalized as follows: Let $\sigma(x)$ be the 1-dimensional projection of an n-dimensional radially symmetric distribution function $\phi_n$, $n = 1, 2, \cdots$. Let $\nu_n(r)$ be the function (43) associated with $\phi_n$, and suppose that there exists a non-decreasing function $\nu(r)$ which is the limit of the sequence of functions


in the sense of the theory of monotone functions. Then there exist two constants c, C (0 << C, 0 S 2C/a*) such that


y(r) = exp (— dx. 
0 


It is also clear from the above proof that an absolutely integrable solution $\delta(r)$ of the equation (49), i.e., of the Abel integral equation


cé(cr) = 8(z) (x? — r?) 8-2) dr, 


exists only if ¢ = 1; in which case it is proportional to exp(— C?*r?/4z). 

7. The italicized statement of § 5 implies a characterization of the n-dimensional distribution functions which are product distributions with respect to every coördinate system.

In § 1, the k-dimensional projection of an n-dimensional radially symmetric distribution function was defined by considering a k-dimensional hyperplane $R_k$ through the origin of the n-dimensional Euclidean space $R_n$. Because of the radial symmetry, the projection was independent of the choice of the hyperplane $R_k$. It is clear that if one considers an arbitrary n-dimensional distribution function $\psi_n(E_n)$ (not necessarily of radial symmetry), one can obtain a sheaf of k-dimensional projections $\psi_k(E_k; R_k)$, where the argument of the distribution function $\psi_k$ is a k-dimensional Borel set $E_k$ on the hyperplane $R_k$ on which $\psi_n(E_n)$ is projected. An n-dimensional distribution function $\psi_n(E_n)$ is said to be a product distribution²¹ if there exists a positive integer $k < n$, a k-dimensional hyperplane $R_k$ and its $(n - k)$-dimensional normal hyperplane $R_{n-k}$ such that if $E_n = E_k \times E_{n-k}$ is an n-dimensional Borel set whose projections on $R_k$, $R_{n-k}$ are $E_k$, $E_{n-k}$, respectively, then

(54) $\psi_n(E_k \times E_{n-k}) = \psi_k(E_k; R_k)\,\psi_{n-k}(E_{n-k}; R_{n-k})$.


The Fourier-Stieltjes criterion for $\psi_n$ to be a product distribution is that there


21 This is slightly more general than the usual concept of a product distribution, in
which (54) is replaced by , 
(El, xX... X Bn) R,) ... Rn,), 
where R1.,..., Rn, are n mutually perpendicular lines. A distribution which is a 
product distribution in this sense is clearly a product distribution in the sense of (54), 
but not conversely. 




exists at least one rectangular coördinate system in $R_n$ with reference to which the Fourier-Stieltjes transform $\Lambda(u_1, \cdots, u_n)$ of $\psi_n$ can be written as the product

(55) $\Lambda(u_1, u_2, \cdots, u_n) = \Lambda(u_1, \cdots, u_k, 0, \cdots, 0)\,\Lambda(0, \cdots, 0, u_{k+1}, \cdots, u_n)$.

(In this coördinate system, the hyperplane $R_k$ in (54) is defined by the equations $x_{k+1} = 0, \cdots, x_n = 0$.)

It will now be proved²² that if $\psi_n(E_n)$, where $n \ge 2$, is an n-dimensional distribution function, then, for a fixed $k$, (54) holds for every pair of orthogonal hyperplanes $R_k$, $R_{n-k}$ if and only if either $\psi_n(E_n^0) = 1$ or there exist constants $a$ ($> 0$), $b_1, \cdots, b_n$ such that

(56) $\psi_n(E_n) = a^n\pi^{-n/2}\int_{E_n} \exp[-a^2\sum_{j=1}^{n} (x_j - b_j)^2]\,dx_1 \cdots dx_n$.

It is understood that $E_n^0$ denotes the Borel set consisting of one point in $R_n$ (not necessarily the origin).

The first half of the theorem is trivial. In order to prove its second half, let $P = (u_1, \cdots, u_n)$ be a point in the space of the Fourier-Stieltjes transform $\Lambda(u_1, \cdots, u_n)$. Then the assumptions of the theorem imply that

(57) $\Lambda(P) = \Lambda(P^k)\Lambda(P^{n-k})$,

where $P^k$, $P^{n-k}$ are the projections of $P$ on an arbitrary pair of orthogonal k- and $(n - k)$-dimensional hyperplanes through the origin of the $(u_1, \cdots, u_n)$-space, respectively.
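The product relation (57) may be illustrated for $n = 2$, $k = 1$. The following sketch compares a transform of the normal type (60) with the transform of the equidistribution on a square, which is a product distribution only with respect to particular axes. All function names and the sample constants `a`, `b` are hypothetical choices for the illustration.

```python
import cmath
import math

def gaussian_transform(u, a=0.7, b=(0.3, -1.1)):
    """Transform of type (60) for n = 2: exp[-sum(a^2 u_j^2 - i b_j u_j)]."""
    return cmath.exp(-sum(a * a * uj * uj - 1j * bj * uj for uj, bj in zip(u, b)))

def square_transform(u):
    """Transform of the equidistribution on the square |x_1| <= 1, |x_2| <= 1."""
    def sinc(t):
        return 1.0 if t == 0 else math.sin(t) / t
    return complex(sinc(u[0]) * sinc(u[1]))

def projections(p, theta):
    """Projections of p on the line through the origin at angle theta and on its normal."""
    e = (math.cos(theta), math.sin(theta))
    f = (-math.sin(theta), math.cos(theta))
    pe = p[0] * e[0] + p[1] * e[1]
    pf = p[0] * f[0] + p[1] * f[1]
    return (pe * e[0], pe * e[1]), (pf * f[0], pf * f[1])

def product_defect(transform, p, theta):
    """|Lambda(P) - Lambda(P^k) Lambda(P^(n-k))| for the pair of lines at angle theta."""
    pk, pnk = projections(p, theta)
    return abs(transform(p) - transform(pk) * transform(pnk))
```

For the normal transform the defect should vanish (up to rounding) for every angle, whereas for the square it vanishes for the coördinate axes but not, in general, for rotated axes.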

Suppose first that the distribution function $\psi_n$ is symmetric with respect to the origin of $R_n$, i.e., $\Lambda(u_1, \cdots, u_n) = \Lambda(-u_1, \cdots, -u_n)$. It will be shown that $\psi_n$ is then of radial symmetry. In fact, let $P$, $Q$ be two distinct points on any sphere with its center at the origin $O$ of the $(u_1, \cdots, u_n)$-space. Consider the plane $POQ$, the pair of lines which bisect the angles formed by the lines $OP$ and $OQ$, and a pair of orthogonal hyperplanes containing these lines and having the dimension numbers $k$ and $n - k$, respectively. Then

$\Lambda(P) = \Lambda(P^k)\Lambda(P^{n-k})$ and $\Lambda(Q) = \Lambda(Q^k)\Lambda(Q^{n-k})$,

where $P^k$, $P^{n-k}$, $Q^k$, $Q^{n-k}$ are the projections of $P$ and $Q$ on these hyperplanes, respectively. It is clear that if the points $P^k$, $Q^k$ do not coincide, then they

22 This problem was considered by Maria-Pia Geppert, “Una proprieta charatteristica della distribuzione de Bravais,” Giornale dell' Istituto Italiano degli Attuari, vol. 7 (1936), pp. 378-391. Her considerations were recently rediscovered by M. Kac [10]. Actually, the final result of Kac is incorrect, since his conclusion is that either $\psi_n$ has a jump of 1 at the origin or (56) must hold with $b_1 = b_2 = \cdots = b_n = 0$. In the case of polar symmetry, Kac used Cauchy's functional equation, which will now be avoided by applying the theorem of § 5.




are situated symmetrically with respect to the origin $O$; the same holds for the pair $P^{n-k}$, $Q^{n-k}$. In virtue of the polar symmetry of $\psi_n$, it follows that $\Lambda(P) = \Lambda(Q)$, which establishes the radial symmetry of $\psi_n$.

Since $\psi_k(E_k; R_k)$, $\psi_{n-k}(E_{n-k}; R_{n-k})$ are projections of the radially symmetric distribution $\psi_n$, they are absolutely continuous on $R_k$, $R_{n-k}$, respectively, if the origin is removed. It follows, therefore, from (54) that $\psi_n$ is absolutely continuous on $R_n$ with the origin removed, and that the density $\delta_n(x_1, \cdots, x_n)$ of $\psi_n$ then is the product of the densities $\delta_k(x_1, \cdots, x_k)$, $\delta_{n-k}(x_{k+1}, \cdots, x_n)$ of $\psi_k$ and $\psi_{n-k}$. Consequently, the radial symmetry of $\psi_n$ implies that there
exist a function $\delta(x)$, $-\infty < x < +\infty$, and two positive constants $c_1$, $c_2$ such that


and 


Thus, it is easy to see that the assumptions of the theorem of § 5 are satisfied; so that $\psi_n$ is a radially symmetric normal distribution except for a possible jump at the origin. However, the product condition implies that the jump at the origin is either 0 or 1. This concludes the proof of the last italicized statement in case $\psi_n$ is symmetric with respect to the origin of $R_n$.

In order to complete the proof, consider the n-dimensional distribution function $\phi_n(E_n)$ whose Fourier-Stieltjes transform is


Then $\phi_n(E_n)$ is symmetric with respect to the origin and satisfies the conditions of the theorem. Hence, $\phi_n(E_n)$ either is a radially symmetric normal distribution or $\phi_n$ has a jump of 1 at the origin. Thus,²³


0OSa< oa. 


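It follows that $\Lambda$ does not vanish for any $(u_1, \cdots, u_n)$; so that
(58) $\Lambda(u_1, \cdots, u_n) = \exp[-a^2(u_1^2 + \cdots + u_n^2) + g(u_1, \cdots, u_n)]$
holds for a suitable continuous function $g(u_1, \cdots, u_n)$ which satisfies the condition
$g(u_1, \cdots, u_n) = -g(-u_1, \cdots, -u_n)$.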


23 The balance of the proof could be based (cf. Kac [10]) on an application of a theorem formulated as a conjecture by Lévy and subsequently proved by Cramér [6]. But this rather deep theorem, for which only a complex function-theoretical proof is available today, may be avoided in this case.




It follows from (57) and (58), by choosing

$P = (u_1, \cdots, u_n)$, $P^k = (u_1, 0, \cdots, 0)$, $P^{n-k} = (0, u_2, \cdots, u_n)$,

that $g(u_1, \cdots, u_n) = g(u_1, 0, \cdots, 0) + g(0, u_2, \cdots, u_n)$, and so

(59) $g(u_1, \cdots, u_n) = g_1(u_1) + \cdots + g_n(u_n)$, where $g_j(u_j) = g(0, \cdots, 0, u_j, 0, \cdots, 0)$

is a continuous odd function of $u_j$.

On applying (57), (58) and (59) to

$P = (u^2 + v^2, 0, \cdots, 0)$, $P^k = (u^2, uv, 0, \cdots, 0)$, $P^{n-k} = (v^2, -uv, 0, \cdots, 0)$,

one obtains

$g_1(u^2 + v^2) = g_1(u^2) + g_1(v^2)$,

if use is made of the fact that $g_2$ is odd and $g_1(0) = 0$. This implies that there exists a constant $c_1$ such that $g_1(u^2) = c_1u^2$; so that, since $g_1$ is odd, $g_1(u) = c_1u$. Similarly, $g_j(u) = c_ju$ for $j = 2, \cdots, n$. Hence, (58) reduces, in view of (59), to

$\Lambda(u_1, \cdots, u_n) = \exp[-\sum_{j=1}^{n} (a^2u_j^2 - c_ju_j)]$.

But since $\Lambda$ is a Fourier-Stieltjes transform of a distribution function, $|\Lambda| \le 1$, so that the constants $c_j$ are purely imaginary, i.e., $c_j = ib_j$ and

(60) $\Lambda(u_1, \cdots, u_n) = \exp[-\sum_{j=1}^{n} (a^2u_j^2 - ib_ju_j)]$, $0 \le a < \infty$.

Since (60) is known to be the Fourier-Stieltjes transform of an n-dimensional distribution of the particular type mentioned in the theorem, the proof is complete.


8. For a positive number $p$ which need not be an integer, let $S_n^p$ be the solid characterized by the inequality

(61) $S_n^p$: $\sum_{j=1}^{n} |x_j|^p \le 1$

in the Euclidean space $R_n$: $(x_1, \cdots, x_n)$. It will be shown that if $A_n^p(x)$, $-\infty < x < +\infty$, denotes the one-dimensional distribution function which one obtains by projecting on a coördinate axis of $R_n$ the n-dimensional equidistribution on $S_n^p$, the density of probability of $A_n^p$ is

(62) const. $(1 - |x|^p)^{(n-1)/p}$ if $|x| < 1$, where const. $= \Gamma(1 + n/p)/[2\Gamma(1 + 1/p)\Gamma(1 + (n-1)/p)]$; and 0, if $|x| > 1$.



In order to prove (62), let $n$ and $p$ be fixed, and let $s$ denote a continuous parameter which varies between 0 and 1. A straightforward homogeneity consideration shows that the n-dimensional volume of that infinitesimal portion of the solid (61) which lies between the two hyperplanes $x_1 = s$, $x_1 = s + ds$ is proportional to $(1 - s^p)^q\,ds$, where $q = (n - 1)/p$. Since the whole solid (61) is contained between the two hyperplanes $x_1 = \pm 1$, and is symmetric with respect to the hyperplane $x_1 = 0$, it follows that (62) holds for some const. $> 0$. Finally, the value of this constant is obvious from

$\int_{-1}^{1} (1 - |x|^p)^{(n-1)/p}\,dx = (2/p)\int_0^1 (1 - y)^{(n-1)/p}y^{1/p-1}\,dy$

and from the fact that the total probability represented by $A_n^p(x)$, $-\infty < x < +\infty$, is unity. This completes the proof of (62).
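The value of the constant in (62) can be checked by a direct numerical integration. The sketch below assumes the Beta-integral evaluation const. $= \Gamma(1 + n/p)/[2\Gamma(1 + 1/p)\Gamma(1 + (n-1)/p)]$; the function names are hypothetical.

```python
import math

def const_62(n: int, p: float) -> float:
    """Assumed closed form of the constant in (62), from the Beta integral."""
    return math.gamma(1 + n / p) / (2 * math.gamma(1 + 1 / p) * math.gamma(1 + (n - 1) / p))

def density_62(x: float, n: int, p: float) -> float:
    """Density (62) of the projection of the equidistribution on the solid (61)."""
    if abs(x) >= 1:
        return 0.0
    return const_62(n, p) * (1 - abs(x) ** p) ** ((n - 1) / p)

def total_mass(n: int, p: float, steps: int = 100000) -> float:
    """Midpoint-rule value of the integral of the density over -1 < x < 1."""
    h = 2.0 / steps
    return h * sum(density_62(-1 + (i + 0.5) * h, n, p) for i in range(steps))
```

For $n = 1$ the density reduces to the constant $\tfrac12$ on $(-1, 1)$, and for $n = 3$, $p = 2$ to $\tfrac34(1 - x^2)$, which may serve as quick consistency checks.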

A corollary of (62) is that if $S_n^p(r)$ denotes, for a fixed $r > 0$, the solid which one obtains by writing $r^p$ instead of 1 on the right of the inequality (61), the projection on a coördinate axis of the equidistribution on $S_n^p(n^{1/p})$ tends, as $n \to \infty$, to the distribution function which has a density proportional to $\exp(-|x|^p/p)$ for $-\infty < x < +\infty$.


In fact, if $p$ is fixed and $n \to \infty$, then


(Stirling) ; while 


$(1 - |x|^p/n)^{(n-1)/p} \to \exp(-|x|^p/p)$ for $-\infty < x < +\infty$.

Hence, from (62), 


63) lim — —A,? | —o ‘ 
But it is clear for reasons of homogeneity that, if $r > 0$, the distribution function $A_n^p(x/r)$, $-\infty < x < +\infty$, belongs to $S_n^p(r)$ in the same way as $A_n^p(x)$ belongs to $S_n^p = S_n^p(1)$. Hence, (63) is equivalent to the last italicized statement.
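The corollary can be observed numerically by comparing the density belonging to $S_n^p(n^{1/p})$ with the normalized limit density. The sketch assumes the closed form of the constant in (62) obtained from the Beta integral, and the normalization $\int \exp(-|x|^p/p)\,dx = 2p^{1/p}\Gamma(1 + 1/p)$; the function names are hypothetical.

```python
import math

def scaled_density(x: float, n: int, p: float) -> float:
    """Density of the projection of the equidistribution on S_n^p(n^(1/p))."""
    r = n ** (1.0 / p)
    z = abs(x) / r
    if z >= 1:
        return 0.0
    # Logarithm of the assumed constant in (62); lgamma avoids overflow for large n.
    log_const = (math.lgamma(1 + n / p) - math.log(2.0)
                 - math.lgamma(1 + 1 / p) - math.lgamma(1 + (n - 1) / p))
    return math.exp(log_const + ((n - 1) / p) * math.log1p(-(z ** p))) / r

def limit_density(x: float, p: float) -> float:
    """exp(-|x|^p / p), normalized to total mass 1."""
    return math.exp(-abs(x) ** p / p) / (2 * p ** (1.0 / p) * math.gamma(1 + 1 / p))

def gap(x: float, n: int, p: float) -> float:
    """Absolute deviation between the finite-n density and its limit at the point x."""
    return abs(scaled_density(x, n, p) - limit_density(x, p))
```

For $p = 2$ the limit density is the normal density of unit standard deviation, in agreement with the spherical case.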


Remark. If $L_n^p(u)$, $-\infty < u < +\infty$, denotes the Fourier transform of $A_n^p(x)$, $-\infty < x < +\infty$, then, according to (62),


(64) $L_n^p(u) = \text{const.}\int_{-1}^{1} e^{iux}(1 - |x|^p)^{(n-1)/p}\,dx$.


It is seen from the integral definition of the Bessel functions $J_\nu(u)$ that (64) reduces for $p = 2$ to

(65) $L_n^2(u) = J^*_{n/2}(u)/J^*_{n/2}(0)$,

if $J^*_\nu(u)$ denotes $J_\nu(|u|)/|u|^\nu$. On the other hand, if $\bar{L}_n^2(u)$ denotes the Fourier transform of the distribution function $\bar{A}_n^2(x)$, which belongs to the equidistribution on the boundary $\bar{S}_n^2$: $\sum_{j=1}^{n} x_j^2 = 1$ of $S_n^2$: $\sum_{j=1}^{n} x_j^2 \le 1$ in the same way as $A_n^2(x)$ belongs to $S_n^2$ itself, then, as mentioned in the Introduction, it is well known that

(66) $\bar{L}_n^2(u) = J^*_{n/2-1}(u)/J^*_{n/2-1}(0)$.

Since comparison of (65) and (66) shows that $\bar{L}_{n+2}^2(u) = L_n^2(u)$, it follows that

(67) $\bar{A}_{n+2}^2(x) = A_n^2(x)$.

In other words, the distribution which is the projection on a diameter of the equidistribution on the interior of the n-dimensional unit sphere is identical with the distribution which is the projection on a diameter of the equidistribution on the boundary of the $(n + 2)$-dimensional unit sphere.²⁴ (Needless to say, this fact may be verified also by calculating the volumes of the spherical segments involved.) Actually, the explicit relation (67) may be interpreted as an essential refinement of a known phenomenon in functional analysis;²⁵ that is, of the fact that, as $n \to \infty$, an overwhelming portion of the sphere $\sum x_j^2 \le 1$ concentrates on its boundary $\sum x_j^2 = 1$.
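The identity of the two projections can be verified directly on the densities. The sketch below takes the interior density from (62) with $p = 2$ and assumes the standard density $\Gamma(m/2)[\pi^{1/2}\Gamma((m-1)/2)]^{-1}(1 - x^2)^{(m-3)/2}$ for the projection of the equidistribution on the boundary of the m-dimensional unit sphere; the function names are hypothetical.

```python
import math

def interior_density(x: float, n: int) -> float:
    """Projection on a diameter of the equidistribution on the interior of the unit sphere of R_n."""
    if abs(x) >= 1:
        return 0.0
    const = math.gamma(1 + n / 2) / (2 * math.gamma(1.5) * math.gamma(1 + (n - 1) / 2))
    return const * (1 - x * x) ** ((n - 1) / 2)

def boundary_density(x: float, m: int) -> float:
    """Projection on a diameter of the equidistribution on the boundary of the unit sphere of R_m, m >= 3."""
    if abs(x) >= 1:
        return 0.0
    const = math.gamma(m / 2) / (math.sqrt(math.pi) * math.gamma((m - 1) / 2))
    return const * (1 - x * x) ** ((m - 3) / 2)

def largest_difference(n: int, steps: int = 200) -> float:
    """Largest deviation between the two densities of the relation checked here on a grid of (-1, 1)."""
    grid = (-1 + 2 * (i + 0.5) / steps for i in range(steps))
    return max(abs(interior_density(x, n) - boundary_density(x, n + 2)) for x in grid)
```

For $n = 1$ both densities equal $\tfrac12$ on $(-1, 1)$, which is the classical fact that the projection of the equidistribution on the surface of the ordinary sphere is an equidistribution on the diameter.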


QUEENS COLLEGE, 
THE JOHNS HOPKINS UNIVERSITY. 


REFERENCES 


[1] Blumenthal, L. M., “ Distance geometries,” University of Missouri Studies, vol. 13 
(1938), no. 2. 

[2] Boltzmann, L., Vorlesungen über Gastheorie, vol. 1 (1896); vol. 2 (1898); Leipzig.

[3] Borel, E., Mécanique statistique classique, (Paris), Gauthier-Villars, 1925.

[4] Borel, E., Introduction géométrique à quelques théories physiques (Paris), Gauthier-Villars, 1914.


24 Cf. Borel [4]; Lévy [12], [13]; Wiener [19].
25 Cf. Borel [4]; Lévy [12], [13]; Wiener [19]. 




[5] Borel, E. and Deltheil, R., Probabilités; erreurs, 4th edition (Paris), Colin (1934).

[6] Cramér, H., “Ueber eine Eigenschaft der normalen Verteilungsfunktion,” Mathematische Zeitschrift, vol. 41 (1936), pp. 405-414.

[7] Deltheil, R., Probabilités géométriques (Paris), Gauthier-Villars (1936).

[8] Haviland, E. K., “On the inversion formula for Fourier-Stieltjes transforms in more than one dimension,” American Journal of Mathematics, vol. 57 (1935); I, pp. 94-100; II, pp. 382-388.

[9] Jessen, B. and Wintner, A., “Distribution functions and the Riemann zeta function,” Transactions of the American Mathematical Society, vol. 38 (1935), pp. 48-88.

[10] Kac, M., “On a characterization of the normal distribution,” American Journal of Mathematics, vol. 61 (1939), pp. 726-728.

[11] Lévy, P., “Théorie des erreurs. La loi de Gauss et les lois exceptionnelles,” Bulletin de la Société Mathématique de France, vol. 52 (1924), pp. 56-58.

[12] Lévy, P., Leçons d'analyse fonctionnelle (Paris), Gauthier-Villars (1922), pp. 262-268, 274-284.

[13] Lévy, P., “Analyse fonctionnelle,” Mémorial des Sciences Mathématiques, fasc. 5 (1925), pp. 39-40.

[14] Schoenberg, I. J., “On certain metric spaces arising from Euclidean spaces by a change of metric and their imbedding in Hilbert space,” Annals of Mathematics, vol. 38 (1937), pp. 787-793.

[15] Schoenberg, I. J., “Metric spaces and completely monotone functions,” Annals of Mathematics, vol. 39 (1938), pp. 811-841.

[16] Schoenberg, I. J., “Metric spaces and positive definite functions,” Transactions of the American Mathematical Society, vol. 44 (1938), pp. 522-536.

[17] Sommerfeld, A., “Eine besonders anschauliche Ableitung des Gaussischen Fehlergesetzes,” Boltzmann-Festschrift, Leipzig (1904), pp. 848-859.

[18] Wiener, N., “Differential space,” Journal of Mathematics and Physics, Massachusetts Institute of Technology, vol. 2 (1923), pp. 131-174.

[19] Wiener, N. and Wintner, A., “On singular distributions,” Journal of Mathematics and Physics, Massachusetts Institute of Technology, vol. 17 (1939), pp. 233-246.

[20] Wintner, A., “Upon a statistical method in the theory of diophantine approximations,” American Journal of Mathematics, vol. 55 (1933), pp. 309-331.

[21] Wintner, A., “On a class of Fourier transforms,” American Journal of Mathematics, vol. 58 (1936), pp. 45-90 and p. 425.

[22] Wintner, A., Spektraltheorie der unendlichen Matrizen, Leipzig (1929), pp. 81-83, 88-91.

[23] Wintner, A., “Spherical equidistributions and a statistics of polynomials which occur in the theory of perturbations,” Strömgren-Festschrift, Copenhagen, 1940 (in press).




ON UPPER LIMIT RELATIONS FOR NUMBER THEORETICAL FUNCTIONS.*


By PHILIP HARTMAN and RICHARD KERSHNER.


There are, in the literature, several results on the limit superior of number theoretical (i.e., additive or multiplicative) functions, giving results of the following nature:

(1) $\limsup_{x\to\infty} f(x)g(x) = 1$,

where $f(x)$ is a number theoretical function and $g(x)$ is elementary. All these results have in common the fact that the functions $f(x)$ and $g(x)$ considered are of such a nature that

(2) $\lim_{n\to\infty} f(r_n)g(r_n) = 1$,

where

$r_n = p_1p_2 \cdots p_n$

is the product of the first $n$ primes.

The purpose of this note is to delimit a simple class of functions for which results of this nature can be obtained. This possibility was suggested to us by Professor Wintner. The greater portion of the paper will deal with additive functions; although multiplicative functions may, of course, be treated by applying these results to their logarithms, this consideration leaves something to be desired, since from

$\log f(x) \le (1 + \delta)/g(x)$, $x > X(\delta)$,

one can infer only

$f(x) \le \exp[(1 + \delta)/g(x)]$, $x > X(\delta)$,

and not

$f(x) \le (1 + \delta)\exp(1/g(x))$, $x > X'(\delta)$,

which would be needed to prove a corresponding limit relation. Correspondingly, the direct treatment of the multiplicative case seems to be more difficult than that of the additive case, and we were unable to establish for multiplicative functions a result of generality comparable to that obtained for additive functions. Thus we have confined the consideration of multiplicative

* Received March 14, 1940. 


functions to one very simple case; which, however, does imply the known limit result for the Euler $\phi$-function.

The treatment will be based on a very simple lemma stating the Tauberian 
conditions needed in order to infer (1) from (2). 


LEMMA. Let $f(x)$, $0 < x < +\infty$, be a real-valued function of the integer $x$, and let $\{r_k\}$ be a sequence of integers, with the following properties:


(i) te < (4 =1,2,---), 
(ii) as koa, 
(iii) /f (tr) 21, as ko 


(iv) for every $\delta > 0$ there exists an $N = N_\delta$ such that
(3) f(z) S (1+ 8)f (tm) whenever and n>N., 
Let $g(x)$ be a non-increasing function such that

(2 bis) $f(r_n)g(r_n) \to 1$, as $n \to \infty$.

Then

$\limsup_{x\to\infty} f(x)g(x) = 1$.


In order to prove this lemma, notice that the conditions (iii) and (2 bis) 
imply that 
(iii bis) 9(Tn-1)/9(1n) as 


If x and n are integers such that the relations 7n-. <7, and (3) hold, 
then, in virtue of the monotony of the function g, 


S (1+ 8) (1) 9 (1%) /9 J. 


Hence, it follows from (2 bis) and (iii bis) that $\limsup f(x)g(x)$ is not greater than 1. On the other hand, (2 bis) alone implies that it is not less than 1. This completes the proof of the lemma.

Before proceeding to the general class of additive functions mentioned 
above, to which this lemma is applicable, two special cases which become im- 
mediately obvious when thought of in connection with this lemma will be 
mentioned. These are the cases of strongly additive and strongly multiplicative 
functions. An additive (multiplicative) function is called strongly additive 
(strongly multiplicative) if $f(p^\nu) = f(p)$ for all $\nu = 1, 2, \cdots$. (Throughout
the paper $p$ will denote a prime and $p_n$ the $n$-th prime.)


THEOREM I. Let $f(x)$, $x = 1, 2, \cdots$, be strongly multiplicative, so that
$f(p_k^\nu) = f(p_k)$, $f(p_i p_k) = f(p_i) f(p_k)$, $(i \ne k)$. Let $f(p_{k+1}) \le f(p_k) \to 1$ as


782 PHILIP HARTMAN AND RICHARD KERSHNER. 


$k \to \infty$. Then the conditions (i)-(iv) of the Lemma are satisfied by the
sequence $\tau_n = p_1 p_2 \cdots p_n$.

The proof is obvious; in fact, (3) is satisfied for $N = 1$ and $\delta = 0$. It
should be mentioned that the requirement of monotony, $f(p_{k+1}) \le f(p_k)$,
cannot be dispensed with. This can be seen by the example


$$f(p_{2^n}) = 1 + 1/n, \qquad f(p_k) = 1 \ \text{ if } k \ne 2^n \text{ for any } n.$$
In spite of the simplicity of Theorem I, the known case¹ of the function
$f(x) = x/\phi(x)$, where $\phi(x)$ is the Euler $\phi$-function, may be treated as a
particular case of this theorem. In fact, $x/\phi(x)$ is strongly multiplicative and


$$p/\phi(p) = (1 - 1/p)^{-1},$$


so the conditions of Theorem I are satisfied. Consequently, the relation


$$\frac{p_1 p_2 \cdots p_n}{\phi(p_1 p_2 \cdots p_n)\, e^{C} \log\log(p_1 p_2 \cdots p_n)} \to 1, \qquad (n \to \infty)$$


(where $C$ is the Euler constant), which is a consequence of Mertens' asymptotic
formula
$$\prod_{p \le x} (1 - 1/p) \sim e^{-C}/\log x$$


and Chebyshev's inequalities, implies, by the Lemma, that


$$\limsup_{x \to \infty} \frac{x}{\phi(x)} \cdot \frac{1}{e^{C} \log\log x} = 1.$$
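
The claim that condition (3) holds here without any slack can be checked by brute force for the first few products of consecutive primes. The following Python sketch is only an illustration: the helper names `phi`, `f`, and `max_up_to` are not taken from the paper, and the check covers only $\tau_1, \ldots, \tau_4$.

```python
from fractions import Fraction
from math import gcd


def phi(n: int) -> int:
    """Euler's phi-function by direct count (adequate for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def f(n: int) -> Fraction:
    """The strongly multiplicative function x / phi(x), kept exact."""
    return Fraction(n, phi(n))


def max_up_to(bound: int) -> Fraction:
    """Largest value of f(x) over 1 <= x <= bound."""
    return max(f(x) for x in range(1, bound + 1))


# tau_1, ..., tau_4: products of the first n primes.
PRIMORIALS = [2, 6, 30, 210]

for tau in PRIMORIALS:
    # No x <= tau_n gives a larger value of f than tau_n itself.
    assert max_up_to(tau) == f(tau)
```

Exact rational arithmetic is used so that the comparison between the maximum and $f(\tau_n)$ does not depend on floating-point rounding.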


The corresponding theorem for the additive case is the following: 


THEOREM II. Let $f(x)$, $x = 1, 2, \cdots$, be strongly additive, so that


$f(p_k^\nu) = f(p_k)$, $f(p_i p_k) = f(p_i) + f(p_k)$, $(i \ne k)$. Let $f(p_{k+1}) \le f(p_k) > 0$.


Then the conditions (i)-(iv) of the Lemma are satisfied by the sequence
$\tau_n = p_1 p_2 \cdots p_n$.


The proof is again obvious. Notice, in connection with our earlier remarks 
on the comparative difficulties of the two cases that the requirements of this 
Theorem II are much weaker than those of the corresponding Theorem I. 
It might also be mentioned, in this same connection, that in this case the 
requirement of monotony can be considerably modified. 

As an application of Theorem II, consider the strongly additive function


$f(n) = f_\alpha(n)$ defined by


¹ E. Landau, Handbuch der Lehre von der Verteilung der Primzahlen, Leipzig
(1909), pp. 219-222.




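$$f_\alpha(p) = -\log(1 - 1/p^\alpha), \qquad f_\alpha(x) = \sum_{p_n \mid x} f_\alpha(p_n).$$
The conditions of Theorem II are obviously satisfied. Also, one has
$$f_\alpha(\tau_n) \sim \sum_{m=1}^{n} 1/p_m^\alpha, \qquad (0 < \alpha < 1);$$
but, in virtue of the prime number theorem,²
$$\sum_{p \le x} 1/p^\alpha \sim \sum_{n=2}^{x} 1/(n^\alpha \log n) \sim x^{1-\alpha}/[(1-\alpha) \log x], \qquad (x \to \infty),$$


so that (2 bis) is satisfied by the function
$$g(n) = (1-\alpha) \log\log n/(\log n)^{1-\alpha}.$$
Hence, by Theorem II and the Lemma,


$$\limsup_{x \to \infty} f_\alpha(x) \log\log x/(\log x)^{1-\alpha} = 1/(1-\alpha), \qquad (0 < \alpha < 1).$$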
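The strongly additive function $f_\alpha(n)$ in this relation may be replaced by the
additive function $\log[\sigma_\alpha(n)/n^\alpha]$, where $\sigma_\alpha(n)$ is the classical (multiplicative)
function defined as the sum of the $\alpha$-th powers of the divisors of $n$. For

$$\sigma_\alpha(p^\nu)/p^{\nu\alpha} = (1 - p^{-(\nu+1)\alpha})/(1 - p^{-\alpha}),$$

which implies, for $0 < \alpha < 1$, that
$$\log[\sigma_\alpha(p^\nu)/p^{\nu\alpha}] < f_\alpha(p^\nu) = f_\alpha(p) \quad \text{and} \quad \log[\sigma_\alpha(\tau_n)/\tau_n^\alpha] \sim f_\alpha(\tau_n).$$
Consequently, the result obtained for $f_\alpha(n)$ may be transcribed as


$$\limsup_{x \to \infty} \log[\sigma_\alpha(x)/x^\alpha] \log\log x/(\log x)^{1-\alpha} = 1/(1-\alpha), \qquad (0 < \alpha < 1),$$


which was first proved by Gronwall³ (using a refinement of the prime number
theorem).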

As another example of the use of Theorem II, consider the function
$f(x) = \rho(x)$ defined as the number of distinct prime divisors of $x$. It is easily
verified that this function is strongly additive. Since $\rho(p_k) = 1$, the conditions
of Theorem II are satisfied by this function $f(x) = \rho(x)$. Consequently,
the relation

$$\rho(p_1 p_2 \cdots p_n)\, \frac{\log\log(p_1 p_2 \cdots p_n)}{\log(p_1 p_2 \cdots p_n)} \to 1, \qquad (n \to \infty),$$

i.e., the relation

(4) $$\frac{n \log\log(p_1 p_2 \cdots p_n)}{\log(p_1 p_2 \cdots p_n)} \to 1, \qquad (n \to \infty),$$

which is an easy consequence of the elementary inequalities of Chebyshev,
implies, by the Lemma, that

(5) $$\limsup_{x \to \infty} \rho(x) \log\log x/\log x = 1.$$

² This is a consequence of the standard procedure of writing
$$\sum_{X < p \le x} 1/p^\alpha = \sum_{X < n \le x} [\theta(n) - \theta(n-1)]/(n^\alpha \log n), \quad \text{where } \theta(n) = \sum_{p \le n} \log p,$$
applying the Abel summation formula to the last sum, and using the prime number
theorem in the form $(1-\epsilon) n < \theta(n) < (1+\epsilon) n$, if $n > X$. (Cf., e.g., loc. cit.¹, p. 25.)
³ T. H. Gronwall, "Some asymptotic expressions in the theory of numbers," Transactions
of the American Mathematical Society, vol. 14 (1913), pp. 113-122. Gronwall
also considers the functions $\sigma_\alpha(n)/n^\alpha$ for $\alpha \ge 1$. However, these cases are simpler
than the ones treated above; in fact, they are easily handled in the multiplicative form,
i.e., without resorting to logarithms. On the other hand, the upper limit is not
approached on the sequence $\tau_n = p_1 p_2 \cdots p_n$, as is the situation above.


We now proceed to the main result. 


THEOREM III. Let $f(x)$, $x = 1, 2, \cdots$, be an additive function such
that, for some $\epsilon < 1$,
(6) $f(p_k^\nu) \le \nu^\epsilon$, $(k = 1, 2, \cdots;\ \nu = 2, 3, \cdots)$;
and
(7) $f(p_k) \to 1$, $(k \to \infty)$.
Then the conditions of the Lemma are satisfied with $\tau_n = p_1 p_2 \cdots p_n$ and


$$g(x) = \log\log x/\log x.$$
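Proof. The conditions (i)-(iii) are obviously satisfied. In order to prove
(iv), let $\delta > 0$ be fixed and let

(8) $$x = p_{i_1}^{\nu_1} p_{i_2}^{\nu_2} \cdots p_{i_k}^{\nu_k} \le \tau_n.$$

Now (7) implies that

$$f(\tau_n)/n \to 1, \qquad (n \to \infty),$$

so that, for any $\eta_1 > 0$, and for sufficiently large $n$,
(9) $$(1+\eta_1)\, n \ge f(\tau_n) \ge (1-\eta_1)\, n.$$
On the other hand, by (6),

$$f(x) \le \sum_{m=1}^{k} \nu_m^\epsilon;$$

so that
$$f(x) \le \sum_{m=1}^{k} (\nu_m^\epsilon \log^\epsilon p_{i_m})(\log^{-\epsilon} p_{i_m}).$$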




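It follows from the inequality of Hölder that

(10) $$f(x) \le \Bigl(\sum_{m=1}^{k} \nu_m \log p_{i_m}\Bigr)^{\epsilon} \Bigl(\sum_{m=1}^{k} (\log p_{i_m})^{-\epsilon/(1-\epsilon)}\Bigr)^{1-\epsilon}.$$
Now, by (8),
(11) $$\sum_{m=1}^{k} \nu_m \log p_{i_m} \le \sum_{m=1}^{n} \log p_m.$$
On the other hand, by the inequality of Chebyshev, for any $\eta_2 > 0$ and for
sufficiently large $n$,

(12) $$\sum_{m=1}^{n} \log p_m \le (1+\eta_2)\, n \log n.$$
Also, for any $\eta_3 > 0$ and for sufficiently large $n$,

(13) $$\sum_{m=1}^{k} (\log p_{i_m})^{-\epsilon/(1-\epsilon)} \le \sum_{m=1}^{n} (\log p_m)^{-\epsilon/(1-\epsilon)} \le (1+\eta_3)\, n\, (\log n)^{-\epsilon/(1-\epsilon)}.$$
If (11), (12), and (13) are substituted in (10), one has
$$f(x) \le [(1+\eta_2)\, n \log n]^{\epsilon}\, [(1+\eta_3)\, n\, (\log n)^{-\epsilon/(1-\epsilon)}]^{1-\epsilon}$$
or

(14) $$f(x) \le (1+\eta_2)^{\epsilon} (1+\eta_3)^{1-\epsilon}\, n$$
for $n$ sufficiently large. Combining (9) and (14) gives
$$f(x) \le (1+\eta_2)^{\epsilon} (1+\eta_3)^{1-\epsilon} (1-\eta_1)^{-1} f(\tau_n),$$
where $\eta_1 > 0$, $\eta_2 > 0$, $\eta_3 > 0$ may be chosen arbitrarily small if $n$ is sufficiently
large. Thus, for any $\delta > 0$, there is an $N_\delta$ such that
(15) $$f(x) \le (1+\delta) f(\tau_n) \quad \text{if } n > N_\delta.$$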


This shows that the condition (iv) of the Lemma is satisfied in the present case. 
The fact that the function g(x) in the Lemma may be chosen to be 


g(x) = log log x/log x 
follows from (4), in virtue of (9). This completes the proof of Theorem III.


It should be mentioned that, in view of (7), the requirement (6) of
Theorem III may be replaced by the condition


(6 bis) $f(p_k^\nu) \le \nu^\epsilon f(p_k)$, $(\nu, k = 1, 2, \cdots)$,


and, in fact, in view of the asymptotic character of the result, (6) or (6 bis)
need only be required for sufficiently large $k$. The same is not true, however,
with regard to $\nu$, and it is quite easy to construct an example where (6) fails
only for $\nu = 2$ but where the result (15) no longer holds if $x$ is chosen of the


form $x = p_1^2 p_2^2 \cdots p_n^2$.




It seems that the requirement (6) or (6 bis) is somewhere near the best 
estimate of its kind which can imply (15). In fact, it is easily seen that if 


$$f(p_k^\nu) > [\nu/(\log \nu)^{1-\epsilon}]\, f(p_k) \quad \text{for some } \epsilon > 0,$$


then (15) fails if x is chosen to be a power of 2. 

An example which satisfies the conditions of Theorem III is the function
$f(x) = \log d(x)/\log 2$, where $d(x) = \sigma_0(x)$ is the number of distinct divisors
of $x$. In this case $f(x)$ is additive and


$$f(p_k^\nu) = \log(\nu + 1)/\log 2, \qquad (\nu, k = 1, 2, \cdots).$$


Thus, the result,
$$\limsup_{x \to \infty} \log d(x) \log\log x/\log x = \log 2,$$


due to Wigert,⁴ follows from Theorem III and the Lemma.
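
The properties of $\log d(x)/\log 2$ used in this example can be verified directly for small arguments. The Python sketch below is illustrative only: the names `num_divisors` and `f` are hypothetical helpers, and the final loop assumes that condition (6) bounds $f(p^\nu)$ by a power $\nu^\epsilon$ with $\epsilon < 1$; the value $\epsilon = 0.7$ is an arbitrary choice made for the demonstration.

```python
from math import isclose, log2


def num_divisors(n: int) -> int:
    """d(n): the number of divisors of n, by trial division."""
    return sum(1 for k in range(1, n + 1) if n % k == 0)


def f(n: int) -> float:
    """f(n) = log d(n) / log 2."""
    return log2(num_divisors(n))


# f takes the value 1 at every prime, as condition (7) requires.
assert all(f(p) == 1.0 for p in (2, 3, 5, 7, 11))

# f(p^v) = log(v + 1) / log 2.
assert all(f(2 ** v) == log2(v + 1) for v in range(1, 10))

# Additivity on coprime arguments: f(4 * 9) = f(4) + f(9).
assert isclose(f(4 * 9), f(4) + f(9))

# Assumed form of (6): f(p^v) <= v^eps for v >= 2, here with eps = 0.7.
EPS = 0.7
assert all(log2(v + 1) <= v ** EPS for v in range(2, 1001))
```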


QUEENS COLLEGE, 
UNIVERSITY OF WISCONSIN. 


⁴ Cf. loc. cit.¹, pp. 219-222.




ON THE PROPERTIES OF A COLLECTIVE.* ¹


By Z. W. Birnbaum and Herbert S. Zuckerman.


1. R. v. Mises² gives the following definition of the simplest collective,
which he also calls an alternative: A simple collective is an infinite sequence
of observations, the result of each of which may be represented by one of two
symbols, say 0 or 1, which satisfies


Postulate 1. If $n_0$ and $n_1$ are the numbers of observations, among the
first $n$, for which the results are 0 and 1 respectively, then the limits of the
relative frequencies, $\lim_{n \to \infty} n_0/n = w_0$ and $\lim_{n \to \infty} n_1/n = w_1$, shall exist; and


Postulate 2. If an infinite subsequence of the total sequence is formed
by a "selection" then, for this subsequence, the same limits exist and their
values remain unchanged, $\lim n'_0/n = w_0$, $\lim n'_1/n = w_1$.


The numbers $w_0$ and $w_1$ are called probabilities of the appearance of the labels
0 and 1 in the collective.

These postulates have become the object of considerable discussion. Most 
of these discussions have centred around the second postulate and a number 
of investigations have been made in attempts to prove the consistency of the 
concept of a collective, in connection with the difficulties encountered in
interpreting this postulate.³

It is the aim of the present paper to prove that a sequence which fulfills 
the first postulate, fulfills also, generally speaking, the second postulate. The 
precise formulation of this statement is given in the following 


THEOREM A. The set of all infinite selections can be interpreted as a
space $\mathfrak{S}$ in which a Lebesgue measure is defined, so that if a sequence of 0's


* Received February 21, 1940. 

¹ Presented to the American Mathematical Society, February 24, 1940.

² R. v. Mises, Wahrscheinlichkeitsrechnung und ihre Anwendungen in der Statistik
und theoretischen Physik, Leipzig u. Wien 1931, p. 14.

³ Certain special cases of our Theorem A are included in some of these investigations,
e.g. in A. H. Copeland, "Point set theory applied to the random selection of
the digits of an admissible number," American Journal of Mathematics, vol. 58 (1936),
pp. 181-192. A special case is also formulated by H. Steinhaus, "Les probabilités
dénombrables et leur rapport à la théorie de la mesure," Fundamenta Mathematicae,
vol. 4 (1923), pp. 286-310, especially p. 305.




and 1's fulfills the first postulate of v. Mises, the second postulate is
fulfilled for the subsequence determined by every selection with the exception
of a set of measure zero in $\mathfrak{S}$.


Theorem A follows from a more general theorem which will be formu- 
lated in the next paragraph. 

2. Let $K$ be an infinite sequence $(a_1, a_2, \cdots)$ of 0's and 1's. The
number of 1's among the first $n$ elements of that sequence is $\sum_{i=1}^{n} a_i$. To each
sequence $K$ we ascribe the real number $k = a_1/2 + a_2/2^2 + \cdots$.

Let $S$ be a selection which, if applied to a sequence $K$, preserves only the
$i_1$-st, $i_2$-nd, $\cdots$ terms. The result of an application of $S$ to $K$ is, therefore,
the sequence $a_{i_1}, a_{i_2}, \cdots$ which we shall denote by $K \subset S$, in accordance with
a notation introduced by Copeland.⁴

A selection $S$ is completely described by a sequence $(b_1, b_2, \cdots)$ where
$b_{i_1} = b_{i_2} = \cdots = 1$, and $b_i = 0$ for all other values of $i$. We obviously have


(1) $$n = \sum_{j=1}^{i_n} b_j.$$


We shall consider only selections $S$ which preserve infinitely many terms of a
sequence to which they are applied, i.e. selections $S$ with $b_i = 1$ for infinitely
many values of $i$. A one-to-one correspondence between the set $\mathfrak{S}$ of all such
selections $S$ and all real numbers $s$ of the interval <0, 1> can be established by
ascribing to the selection $S = (b_1, b_2, \cdots)$ the number $s = b_1/2 + b_2/2^2 + \cdots$.
We introduce a measure in $\mathfrak{S}$ by calling a set $\Sigma$ in $\mathfrak{S}$ measurable if and only
if the set $\sigma$ of corresponding numbers in <0, 1> is measurable in the sense of


Lebesgue, and by defining
measure of $\Sigma$ = measure of $\sigma$ = $m(\sigma)$.


The relative frequencies of the 1's in $K$ are


(2) $$f_n(K) = \frac{1}{n} \sum_{i=1}^{n} a_i,$$
while those in $K \subset S$ are given by
(3) $$f_n(K \subset S) = \frac{1}{n} \sum_{i=1}^{i_n} a_i b_i.$$
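
The definitions of a selection and of the two relative frequencies can be tried out on finite initial segments. The Python sketch below is an illustration only: the function names and the two short sample sequences are assumptions made for the example, not part of the paper.

```python
from fractions import Fraction
from typing import List


def select(K: List[int], S: List[int]) -> List[int]:
    """Apply the selection S to K: keep a_i exactly where b_i = 1."""
    return [a for a, b in zip(K, S) if b == 1]


def freq(seq: List[int], n: int) -> Fraction:
    """Relative frequency of 1's among the first n terms of seq."""
    return Fraction(sum(seq[:n]), n)


def freq_selected(K: List[int], S: List[int], n: int) -> Fraction:
    """(1/n) * sum of a_i * b_i over i up to the n-th index with b_i = 1."""
    kept = 0
    total = 0
    for a, b in zip(K, S):
        total += a * b
        kept += b
        if kept == n:
            return Fraction(total, n)
    raise ValueError("S preserves fewer than n terms of this segment")


K = [1, 0, 1, 1, 0, 0, 1, 0]
S = [0, 1, 1, 0, 1, 1, 1, 0]

# Both ways of computing the frequencies of the selected sequence agree.
for n in range(1, 6):
    assert freq_selected(K, S, n) == freq(select(K, S), n)
```

Computing the frequency from the products $a_i b_i$ and computing it from the selected subsequence give the same value, which is the content of the second formula.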


THEOREM B. If $F(K)$ is the set of points of condensation of the sequence
$f_1(K), f_2(K), \cdots$, then $F(K) = F(K \subset S)$ almost everywhere in $\mathfrak{S}$, i.e.
for all $S$ except those of a set of measure zero.


⁴ loc. cit. ³.




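Proof of Theorem B. We denote by $r_i(t)$, $i = 1, 2, \cdots$, the well known
Rademacher⁵ functions which are defined for $0 \le t \le 1$ as follows: if
$t = t_1/2 + t_2/2^2 + \cdots$ is the infinite dyadic expansion of $t$, then $r_i(t) = 1$
if $t_i = 1$, and $r_i(t) = -1$ if $t_i = 0$. We evidently have
(4) $$t_i = \tfrac{1}{2}(r_i(t) + 1).$$

Using as arguments for those functions the numbers $k$ and $s$ which correspond
to $K$ and $S$, we find, from (2) and (4),

(5) $$f_n(K) = \frac{1}{2} + \frac{1}{2n} \sum_{i=1}^{n} r_i(k),$$

and, from (1), (3), and (4),


(6) $$f_n(K \subset S) = \frac{1}{2} + \frac{1}{4n} \sum_{i=1}^{i_n} r_i(k) + \frac{1}{4n} \sum_{i=1}^{i_n} r_i(k) r_i(s).$$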


The functions $r_1(s), r_2(s), \cdots$ are a normed orthogonal system. It is known⁶
that for such a system the relation


(7) $$\lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} r_i(s) = 0$$

holds for almost all values of $s$ in <0, 1>. Similarly, the functions $\rho_1(s)$
$= r_1(k) r_1(s)$, $\rho_2(s) = r_2(k) r_2(s)$, $\cdots$, form a normed orthogonal system,
and therefore we again have

(8) $$\lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} r_i(k) r_i(s) = 0,$$


for almost all $s$. From (5), (6), (7), and (8) we see that
(9) $$\lim_{n \to \infty} \{f_n(K \subset S) - f_{i_n}(K)\} = 0,$$


for almost all $S$. Hence every point of condensation of the sequence
$\{f_n(K \subset S)\}$ is a point of condensation of the sequence $\{f_{i_n}(K)\}$, and therefore
$F(K \subset S)$ is contained in $F(K)$ for almost all $S$.
We shall now prove that $F(K \subset S)$ also contains $F(K)$ for almost all $S$.
It is easy to see that if $F(K)$ contains two numbers $r < u$, then it also contains
all numbers $t$ with $r \le t \le u$. We let $\alpha$ be the smallest and $\beta$ the
largest number in $F(K)$. It will suffice to prove that $\alpha$ and $\beta$ belong to
$F(K \subset S)$ for almost all $S$.

⁵ H. Rademacher, "Einige Sätze über Reihen von allgemeinen Orthogonalfunktionen,"
Mathematische Annalen, vol. 87 (1922), pp. 112-138.
⁶ S. Banach, "Sur la valeur moyenne des fonctions orthogonales," Bull. Ac. Crac.,
1919, pp. 66-72.
Let $\{l_n\}$ be a sequence of such indices that $\lim_{n \to \infty} f_{l_n}(K) = \alpha$. If $\alpha$ is not
contained in $F(K \subset S)$ for a certain $S$ then it is not a point of condensation
of $\{f_n(K \subset S)\}$ and, by (9), it is not a point of condensation of $\{f_{i_n}(K)\}$.
Therefore only a finite number of the indices $i_n$ are equal to some $l_m$, and, if
$S$ is $(b_1, b_2, \cdots)$, then we have $b_{l_m} = 0$ from a certain $m$ on. We now let
$T_r$ be the set of all $S$ such that $b_{l_m} = 0$ for all $m \ge r$, and $T = \sum_{r=1}^{\infty} T_r$. All $S$
for which $\alpha$ is not a point of condensation of $\{f_n(K \subset S)\}$ belong to $T$.
However, it is easily seen that each set $T_r$ is of measure zero, and hence $T$ is
also of measure zero. From this we see that the set $E_\alpha$ of all $S$ for which $\alpha$
is not a point of condensation of $\{f_n(K \subset S)\}$, is of measure zero. By the
same argument, the set $E_\beta$ of those $S$ for which $\beta$ is not a point of condensation
of the sequence $\{f_n(K \subset S)\}$ is, too, a set of measure zero. If $E$ is the
set (of measure zero) of those $S$ for which (9) does not hold, then
$E' = <0, 1> - E - E_\alpha - E_\beta$ is of measure one and contains only selections $S$
for which both $\alpha$ and $\beta$ belong to $F(K \subset S)$. If both $\alpha$ and $\beta$ belong to
$F(K \subset S)$ then $F(K \subset S)$ contains every number between $\alpha$ and $\beta$, and
hence contains $F(K)$. Since the measure of $E'$ is one, this completes the
proof of Theorem B.


3. Theorem B states that, for a fixed $K$, there is a set of measure one
of selections $S$ which leave the set of points of condensation of the sequence of
relative frequencies invariant, i.e. $F(K \subset S) = F(K)$.

The dual statement is also true:⁷ for a fixed selection $S$ and almost all $K$
we have $F(K) = F(K \subset S)$. To see this we note that by a classical theorem
due to Borel,⁸ for almost all $K$, the set $F(K)$ contains only the number $\tfrac{1}{2}$.
On the other hand from (6) and (7) we find that, for a fixed $S$ and almost
all $K$, we have $\lim f_n(K \subset S) = \tfrac{1}{2}$.


The question may be asked whether it is possible to find a set M of 
sequences K and a set N of selections S such that each set is of measure one 
and that $F(K) = F(K \subset S)$ for every $K$ in $M$ and every $S$ in $N$. The answer
to this question is negative as may be seen from the following argument: 

We first discard the set of measure zero of those $K$ which contain only a
finite number of 1's. Now, if the same sequence of 0's and 1's is used
for $K$ and for $S$, i.e. if $K = S$, then $K \subset S$ is a sequence consisting only of
1's. Therefore, for every $K$, we have $f_n(K \subset K) = 1$, $n = 1, 2, \cdots$.
Hence, if 1 is not a point of condensation of $\{f_n(K)\}$ we have $F(K)
\ne F(K \subset K)$. For almost all $K$ the set $F(K)$ contains only the number $\tfrac{1}{2}$,
therefore $F(K) \ne F(K \subset K)$ for almost all $K$. It follows that, if $M$ is a set
of measure one of sequences $K$, and $N$ a set of selections $S$ such that for all $K$
in $M$ and all $S$ in $N$ we have $F(K) = F(K \subset S)$, then $N$ must not contain
any $S = K$ with $K$ contained in $M$, and therefore the measure of $N$ is zero.

⁷ For a more general treatment of such "dual" problems, i.e. those with a fixed
selection and sets of collectives, see Z. W. Birnbaum and J. Schreier, "Eine Bemerkung
zum starken Gesetz der grossen Zahlen," Studia Mathematica, vol. 4 (1933), pp. 85-89.

⁸ E. Borel, "Les probabilités dénombrables et leurs applications arithmétiques,"
Rend. Circ. mat. Palermo, vol. 27 (1909), pp. 247-271.
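
The observation that a sequence selected by itself consists only of 1's can be seen on a finite segment. In the Python sketch below, the alternating sequence and the helper names `select` and `freq` are assumptions chosen for illustration only.

```python
from fractions import Fraction
from typing import List


def select(K: List[int], S: List[int]) -> List[int]:
    """Apply the selection S to K: keep the terms of K where S has a 1."""
    return [a for a, b in zip(K, S) if b == 1]


def freq(seq: List[int], n: int) -> Fraction:
    """Relative frequency of 1's among the first n terms of seq."""
    return Fraction(sum(seq[:n]), n)


# A sample sequence with relative frequency 1/2: 0, 1, 0, 1, ...
K = [i % 2 for i in range(1000)]

# Using K itself as the selection keeps exactly the 1's of K.
K_in_K = select(K, K)

assert freq(K, len(K)) == Fraction(1, 2)
assert all(freq(K_in_K, n) == 1 for n in range(1, len(K_in_K) + 1))
```

For this sample the frequencies of the original sequence settle at 1/2 while those of the selected sequence are identically 1, so the two sets of points of condensation differ.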


UNIVERSITY OF WASHINGTON, 
SEATTLE, WASHINGTON. 


ON SYMMETRIC BERNOULLI CONVOLUTIONS.* 


By Tatsuo Kawata. 


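1. Let $\Lambda(t; \sigma)$, $-\infty < t < +\infty$, denote the Fourier-Stieltjes transform


(1) $$\Lambda(t; \sigma) = \int_{-\infty}^{+\infty} e^{itx}\, d\sigma(x)$$


of a distribution function $\sigma = \sigma(x)$, $-\infty < x < +\infty$. Let $\beta(x)$ denote the
symmetric Bernoulli distribution, which has at either of the points $x = \pm 1$
the jump $\tfrac{1}{2}$; so that $\Lambda(t; \beta) = \cos t$, and so the Fourier-Stieltjes transform
of the distribution function $\beta(x/b)$, where $b > 0$, is $\cos bt$. Thus,¹ the
infinite convolution

(2) $\sigma(x) = \beta(x/b_1) * \beta(x/b_2) * \beta(x/b_3) * \cdots$, where $b_k > 0$,

is convergent if and only if


(3) $$\sum_{k=1}^{\infty} b_k^2 < \infty,$$

in which case

(4) $$\Lambda(t; \sigma) = \prod_{k=1}^{\infty} \cos b_k t.$$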

Wintner has obtained on the one hand² Gaussian estimates of $1 - \sigma(x)$
and $\sigma(-x)$ for large $x > 0$ in case of an arbitrary $\{b_k\}$ satisfying (3), and
on the other hand³ almost Gaussian estimates of $\Lambda(t; \sigma) = \Lambda(-t; \sigma)$ for
large t > 0 in case {bx} is suitably chosen (e. g., b, = k4**, « > 0); he has 
also pointed out? the relation of these estimates to a conjecture of Wiener, 
proved by Hardy.⁴ The object of this note is a precise investigation of this
relation. 

2. First, if $\{b_k\}$ satisfies (3), then there exists a $\lambda > 0$ such that²

(5) $1 - \sigma(x) = O \exp(-\lambda x^2)$ and $\sigma(-x) = O \exp(-\lambda x^2)$, as $x \to +\infty$.
Actually, (5) holds for every fixed $\lambda$. In order to see this, one merely has
to combine the proof² for the existence of a sufficiently small $\lambda$ with a known
device,⁵ which consists in replacing the sequence $b_1, b_2, \cdots$ by the sequence
$b_{N+1}, b_{N+2}, \cdots$, where $N = N(\lambda)$.


* Received March 24, 1939. 

¹ B. Jessen and A. Wintner, "Distribution functions and the Riemann zeta-
function," Transactions of the American Mathematical Society, vol. 38 (1935), p. 61.

² A. Wintner, "Gaussian distributions and convergent infinite convolutions," American
Journal of Mathematics, vol. 57 (1935).

³ A. Wintner, "On analytic convolutions of Bernoulli distributions," American
Journal of Mathematics, vol. 56 (1934); "On symmetric Bernoulli convolutions,"
Bulletin of the American Mathematical Society, vol. 41 (1935).

⁴ G. H. Hardy, "A theorem concerning Fourier transforms," Journal of the London
Mathematical Society, vol. 8 (1933).



In the particular case where $\Lambda(t; \sigma)$ is so small for large $|t|$ as to imply
the existence of a continuous derivative $\sigma'(x)$, one has
(6) $\sigma'(x) = O \exp(-\lambda x^2)$, $x \to \pm\infty$,
for every fixed $\lambda$. This follows from (5) by a known argument.⁶

Now, (6) implies that there does not exist a convergent Bernoulli convolution
(2) whose Fourier-Stieltjes transform (1) is $O \exp(-\delta t^2)$ for a
sufficiently small $\delta > 0$.

In fact, if there existed a $\delta > 0$ for a suitable sequence $\{b_k\}$ satisfying
(3), then, on choosing $\lambda$ in (6) sufficiently large, one could conclude from
the theorem of Hardy⁴ that $\Lambda(t; \sigma)$ is of the form $P(t) \exp(-\alpha t^2)$, where
$P(t)$ is a polynomial and $\alpha$ a constant. This involves a contradiction, since
(4) has infinitely many (real) zeros and does not vanish identically.

3. It will now be shown that the result of Section 2 cannot be essentially
improved. In fact, it will be shown that there exists to every positive
increasing function $p(t)$, $0 < t < \infty$, which satisfies the condition


(7) $$\int_1^{\infty} \frac{p(t)}{t^3}\, dt < \infty$$

a convergent symmetric Bernoulli convolution (2) in such a way that
(8) $\Lambda(t; \sigma) = O \exp(-p(|t|))$, as $t \to \pm\infty$.


In the proof it may be assumed that $p(t)$ tends with $t$ to $+\infty$ in a
monotonous way, since otherwise we could replace $p(t)$ by $p(t) + t$.
Now put, for $t > 1$,
$$q(t) = \int_1^{t} \frac{p(u)}{u}\, du.$$


Then clearly $q(t)$ is increasing and, since


Jt 21 


we have 
(9) $p(t) = o(t^2)$.
Furthermore, since 
, 

J us 2u* us 
and 

q(t) 1 o(u) , j ¢ 

=o(1 

| du 0(u)du = o(1), 
we have 


⁵ Cf. B. Jessen and A. Wintner, loc. cit.¹, p. 67.
⁶ B. Jessen and A. Wintner, loc. cit.¹, p. 68.


(10) $$\int_1^{\infty} \frac{q(t)}{t^3}\, dt < \infty.$$
Now
(11) $$q(3t) - q(t) = \int_t^{3t} \frac{p(u)}{u}\, du \ge p(t) \log 3 \ge p(t) + A$$


for $t \ge t_0$, where $A = \log(1/\cos \tfrac{1}{3})$.


Let $r(t)$ denote the inverse function of $q(t)$, and put $\phi(t) = 1/r(t)$.


Then we can easily see that $\phi(t) \to 0$. Since


we have 
N N 
$°(t)dt — —9°(1) —2 tp (t) $’(t) dt 
2 ¢’(t) 
0(1) — ¢*(1) J, 
1/p(N) 
—0(1) —¢?(1) +2 q(u) du. 
1/P(1) 
Thus,

$$\int_1^{\infty} \phi^2(t)\, dt < \infty.$$

Since $\phi^2(u)$ is monotone, it follows that (3) is satisfied by $b_n = \phi(nA)$. It
will be shown that, for these $b_n$, the function (4) satisfies (8).
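Let $c > 0$. The number of those $n$ which satisfy $b_n t \ge c$ is $[q(t/c)/A]$,⁷
for the inequality $b_n t \ge c$ is equivalent to $\phi(nA) \ge c/t$, i.e., to $r(nA) \le t/c$
or $n \le (1/A)\, q(t/c)$. Thus, the number of those $n$ which satisfy $1 > b_n t \ge 1/3$ is

$$[q(3t)/A] - [q(t)/A] \ge q(3t)/A - q(t)/A - 1 \ge (p(t) + A)/A - 1 = p(t)/A,$$
for $t \ge t_0$. Hence,
$$|\Lambda(t; \sigma)| = \Bigl|\prod_{n=1}^{\infty} \cos(b_n t)\Bigr| \le \prod_{1 > b_n t \ge 1/3} \cos(b_n t) \le (\cos \tfrac{1}{3})^{p(t)/A} = \exp(-p(t)), \quad \text{for } t \ge t_0.$$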


Since (4) is an even function, the proof of (8) is complete. 
Finally, I should like to express my hearty thanks to Professor A. Wintner 
for his invaluable criticism and advice. 


TOHOKU UNIVERSITY,
SENDAI.


⁷ $[x]$ represents the integral part of $x$.


tp (1) r(t) =o(1), 

THE FOUR-VERTEX THEOREM FOR SPHERICAL CURVES.* ¹


By S. B. Jackson.


1. Introduction. The Four-Vertex Theorem or "Vierscheitelsatz"
states that every oval of class C'' in the plane possesses at least four extrema
of the curvature, where an oval may be defined as a simple closed curve with
non-vanishing curvature.² This theorem has been extended to other classes
of plane curves by Fog and Graustein³ and to certain restricted classes of
space curves by Süss, Takasu, and others.⁴ As regards the space curves, the
results have been very fragmentary, and the curves considered have been
principally those that are closely enough related to plane curves so that analogous
proofs can be carried over. This is not surprising when one considers
that the property of closure for a space curve puts a much lighter restriction
on the curvature than does the same condition for a plane curve, which is
completely determined by the curvature as a function of the arc. Accordingly,
it seems more reasonable to look for a generalization of the theorem to
spherical curves, with curvature replaced by geodesic curvature, since a curve
on the sphere is completely determined by its geodesic curvature as a function
of the arc length. Such a generalization is the object of the present paper.

By a suitably chosen inversion, any spherical curve can be transformed 
into a plane curve. Under this transformation, it is found (§3) that the 
geodesic vertices of the spherical curve, that is, the extrema of the geodesic 
curvature, are transformed into the vertices of the plane curve. From the 
known results for plane curves³ there follows at once the existence of at least
four geodesic vertices on any simple closed spherical curve of class C'''.


* Received February 19, 1940. 

¹ Presented to the Society, April 8, 1938.

² First published apparently by Mukhopadhyaya, "New methods in the geometry
of a plane arc," Bulletin of the Calcutta Mathematical Society, vol. 1 (1909), pp. 31-37,
and since then appearing repeatedly in the literature.

³ D. Fog, "Über den Vierscheitelsatz und seine Verallgemeinerungen," Sitzungsberichte
der Berlin Akademie der Wissenschaft (1933), pp. 251-254; W. C. Graustein,
"Extensions of the four-vertex theorem," Transactions of the American Mathematical
Society, vol. 41 (1937), pp. 9-23.

⁴ W. Süss, "Ein Vierscheitelsatz bei geschlossenen Raumkurven," Tôhoku Mathematical
Journal, vol. 29 (1928), pp. 359-362; T. Takasu, "Vierscheitelsatz für Raumkurven,"
Tôhoku Mathematical Journal, vol. 39 (1934), pp. 292-298. Also a number
of other papers. W. C. Graustein and S. B. Jackson, "The four-vertex theorem for a
certain type of space curves," Bulletin of the American Mathematical Society, vol. 43
(1937), pp. 737-741.



In pushing the results beyond the case of the simple closed curves, a study
of certain spherical arcs is made, called arcs of type $\Omega$ (§ 5) because of their
shape. These are entirely analogous to Graustein's arcs of type $\Omega$ in the plane.
It turns out, in fact, that by a suitable inversion a spherical arc of type $\Omega$
may be carried into a plane arc of type $\Omega$. Thereby the fundamental property
of the plane arcs of type $\Omega$ is transferred at once to the spherical arcs of type $\Omega$.
This property states that there exists at least one non-negative minimum of
geodesic curvature interior to any spherical arc of type $\Omega$. By means of it the
Four-Vertex Theorem is extended to a large class of non-simple spherical
curves.

In a paper in 1936,⁵ Graustein strengthened the original Four-Vertex
Theorem. A vertex is called primary if, at the vertex, the curvature is greater 
than or less than the average curvature according as it is a maximum or a 
minimum, and it is shown that the primary vertices outnumber the other 
(secondary) vertices by at least four for every plane oval. We shall establish 
precisely analogous results for a certain class of spherical curves, namely those 
which are tangent indicatrices of other spherical curves (§ 7). The question 
as to whether the strengthened theorem holds for a wider class of spherical 
curves is left open. 

A close relationship is exhibited between the geodesic vertices on the
tangent indicatrix of a twisted space curve, and the dual vertices defined by
Takasu⁴ (§ 8). The relation of the geodesic vertices of a spherical curve to
the ordinary vertices, that is, the extrema of the ordinary curvature, is also
clarified (§ 9). It appears that every geodesic vertex is a vertex, but not
conversely, whence any spherical curve has at least as many vertices as geodesic
vertices.


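2. Transformation of curves by inversion. If $C$: $x = x(s)$ is a regular
twisted space curve of class C''', lying on a surface, $\Sigma$, the following well known
equations are valid:⁶


(2.1) $$\frac{d\alpha}{ds} = \frac{\nu}{\rho} + \frac{\zeta}{r}, \qquad \frac{d\nu}{ds} = -\frac{\alpha}{\rho} + \frac{\zeta}{\tau}, \qquad \frac{d\zeta}{ds} = -\frac{\alpha}{r} - \frac{\nu}{\tau},$$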


⁵ W. C. Graustein, "A new form of the four-vertex theorem," Monatshefte für
Mathematik und Physik, Wirtinger Festband (1936), pp. 381-384.
⁶ See, for example, W. C. Graustein, Differential Geometry, Macmillan (1935),
pp. 163-165.




where $1/\rho$, $1/r$, and $1/\tau$ are, respectively, the geodesic curvature, the normal
curvature, and the geodesic torsion of $C$ on $\Sigma$, and $\alpha$, $\zeta$, $\nu$ are, respectively,
the unit tangent vector to $C$, the unit normal vector to $\Sigma$, and the unit vector
tangent to $\Sigma$ and orthogonal to $C$ such that $(\alpha \nu \zeta) = 1$.⁷
Let us seek the equations of transformation for the quantities $1/\rho$, $1/r$,
$1/\tau$ and the curvature $1/R$, of $C$ under an inversion in space. If the sphere
of inversion has radius $a$ and center $O$, the equation of the inversion in vector
form is

(2.2) $$x' = \frac{a^2}{(x|x)}\, x,$$

where $x$ and $x'$ denote the vectors $OP$ and $OP'$, respectively. From this
equation it follows that the relation between the elements of arc, $ds$ and $ds'$,
of $C$ and its image $C'$, respectively, is

(2.3) $$\frac{ds}{ds'} = \frac{(x|x)}{a^2}.$$
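
The inversion and the resulting scaling of arc length can be checked numerically. The Python sketch below is an illustration under stated assumptions: the sample curve (a unit circle in the plane $z = 1$), the radius $a = 2$, and the helper names are all hypothetical choices; the inversion is taken in the standard form $x' = a^2 x/(x|x)$, and the ratio of arc elements is approximated by a ratio of short chord lengths.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def inner(u: Vec, v: Vec) -> float:
    """Inner product (u|v)."""
    return sum(p * q for p, q in zip(u, v))


def invert(x: Vec, a: float) -> Vec:
    """Inversion in the sphere of radius a centred at the origin O."""
    scale = a * a / inner(x, x)
    return tuple(scale * c for c in x)


def dist(u: Vec, v: Vec) -> float:
    """Euclidean distance between two points."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(u, v)))


def curve(s: float) -> Vec:
    """Sample unit-speed curve: a unit circle in the plane z = 1."""
    return (3.0 + math.cos(s), math.sin(s), 1.0)


a, s, h = 2.0, 0.7, 1e-6
x, y = curve(s), curve(s + h)

# Chord ratio approximating ds'/ds at the point x.
ratio = dist(invert(x, a), invert(y, a)) / dist(x, y)

# Value predicted by the arc-length relation: ds'/ds = a^2 / (x|x).
expected = a * a / inner(x, x)

assert abs(ratio - expected) < 1e-5
```

Applying `invert` twice returns the original point, which reflects the fact that an inversion is its own inverse.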


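If $\xi$ is an arbitrary unit vector localized at the point, $P$, and $\xi'$ is the
corresponding unit vector at the inverse point, $P'$, it is readily shown that


(2.4) $$\xi' = \xi - \frac{2(x|\xi)}{(x|x)}\, x.$$

In particular, the vectors $\alpha$, $\nu$, $\zeta$ of the trihedral of $C$ on $\Sigma$ transform into

(2.5) $$\alpha' = \alpha - \frac{2(x|\alpha)}{(x|x)}\, x, \qquad \nu' = \nu - \frac{2(x|\nu)}{(x|x)}\, x, \qquad Z' = \zeta - \frac{2(x|\zeta)}{(x|x)}\, x.$$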
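
The transformation of unit vectors under inversion is a reflection in the plane orthogonal to $x$, which is why it preserves lengths and angles but reverses the disposition of a trihedral. The Python sketch below illustrates this for one sample point; the standard basis is used as a hypothetical trihedral, the map is taken in the form $\xi' = \xi - 2(x|\xi)x/(x|x)$, and the helper names are not from the paper.

```python
from typing import Tuple

Vec = Tuple[float, float, float]


def inner(u: Vec, v: Vec) -> float:
    """Inner product (u|v)."""
    return sum(p * q for p, q in zip(u, v))


def transform(xi: Vec, x: Vec) -> Vec:
    """Image of the vector xi at the point x under the inversion."""
    c = 2.0 * inner(x, xi) / inner(x, x)
    return tuple(e - c * p for e, p in zip(xi, x))


def triple(u: Vec, v: Vec, w: Vec) -> float:
    """Triple scalar product (u v w), i.e. the determinant of the three vectors."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))


# A sample right-handed trihedral and a sample point x.
alpha, nu, zeta = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
x = (1.0, 2.0, 3.0)

alpha_p, nu_p, Z_p = (transform(v, x) for v in (alpha, nu, zeta))
zeta_p = tuple(-c for c in Z_p)  # reversed normal, restoring the disposition

assert abs(triple(alpha_p, nu_p, Z_p) + 1.0) < 1e-12     # left-handed
assert abs(triple(alpha_p, nu_p, zeta_p) - 1.0) < 1e-12  # right-handed again
```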


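The vectors $\alpha'$ and $\nu'$ may be viewed as the first two vectors of the trihedral
of the inverted curve, $C'$, on the inverted surface, $\Sigma'$. Since inversion carries
a right trihedral into a left trihedral and vice versa, it is necessary to take for
the surface normal to $\Sigma'$ not $Z'$ but $\zeta' = -Z'$ in order to preserve the convention
that the trihedral have the same disposition as the axes. The trihedral
for $C'$ on $\Sigma'$ is, therefore, $\alpha'$, $\nu'$, $\zeta'$ and equations (2.1) for $C'$ become


(2.6) $$\frac{d\alpha'}{ds'} = \frac{\nu'}{\rho'} + \frac{\zeta'}{r'}, \qquad \frac{d\nu'}{ds'} = -\frac{\alpha'}{\rho'} + \frac{\zeta'}{\tau'}, \qquad \frac{d\zeta'}{ds'} = -\frac{\alpha'}{r'} - \frac{\nu'}{\tau'},$$


where the primes denote quantities referred to $C'$.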


⁷ For vector notation see Chapter I, loc. cit. ⁶.




Differentiating the first of (2.5) with respect to s’ and substituting from 
(2.1), (2.3), and (2.6), we obtain the relation 


(zie) 1. , 1, 2(z\v) 1 

(2. 7) 1,244 Mala)’ 
amr 


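The inner product of (2.7) with the second of formulas (2.5) yields the
equation


(2.8) $$\frac{1}{\rho'} = \frac{ds}{ds'} \frac{1}{\rho} + \frac{2(x|\nu)}{a^2},$$


which represents the desired transformation of the geodesic curvature of $C$
into the geodesic curvature of $C'$. By differentiation of (2.8) with respect
to $s'$ and substitution from (2.1) and (2.3), we find


(2.9) $$\frac{d}{ds'}\Bigl(\frac{1}{\rho'}\Bigr) = \Bigl(\frac{ds}{ds'}\Bigr)^2 \frac{d}{ds}\Bigl(\frac{1}{\rho}\Bigr) + \frac{2(x|\zeta)}{a^2} \frac{ds}{ds'} \frac{1}{\tau}$$
as the equation of transformation of the derivative of the geodesic curvature.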
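In order to obtain the corresponding formulas of transformation for the
normal curvature, it is only necessary to form the inner product of (2.7) with
$\zeta' = -Z'$. The result is


(2.10) $$\frac{1}{r'} = -\frac{ds}{ds'} \frac{1}{r} - \frac{2(x|\zeta)}{a^2},$$


and differentiation of this relation and use of (2.1) and (2.3) yield the
equation


(2.11) $$\frac{d}{ds'}\Bigl(\frac{1}{r'}\Bigr) = -\Bigl(\frac{ds}{ds'}\Bigr)^2 \frac{d}{ds}\Bigl(\frac{1}{r}\Bigr) + \frac{2(x|\nu)}{a^2} \frac{ds}{ds'} \frac{1}{\tau}.$$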

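A similar procedure in the case of the geodesic torsion gives the following
equation of transformation


(2.12) $$\frac{1}{\tau'} = -\frac{ds}{ds'} \frac{1}{\tau}.$$


Since $1/R^2 = 1/\rho^2 + 1/r^2$, we have, on squaring and adding (2.8) and (2.10),

$$\frac{1}{R'^2} = \Bigl(\frac{ds}{ds'}\Bigr)^2 \frac{1}{R^2} + \frac{4}{a^2} \frac{ds}{ds'} \Bigl[\frac{(x|\nu)}{\rho} + \frac{(x|\zeta)}{r}\Bigr] + \frac{4}{a^4} \bigl[(x|\nu)^2 + (x|\zeta)^2\bigr].$$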

From (2.1) and the Frenet-Serret formulas, it follows that

$$\frac{\beta}{R} = \frac{d\alpha}{ds} = \frac{\nu}{\rho} + \frac{\zeta}{r},$$


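where $\beta$ is the unit principal normal vector for $C$. Making use of this, together
with the identity
$$(x|x) = (x|\alpha)^2 + (x|\nu)^2 + (x|\zeta)^2,$$


we find
(2.13) $$\frac{1}{R'^2} = \Bigl(\frac{ds}{ds'}\Bigr)^2 \frac{1}{R^2} + \frac{4(x|\beta)}{a^2} \frac{ds}{ds'} \frac{1}{R} + \frac{4}{a^2} \frac{ds}{ds'} - \frac{4(x|\alpha)^2}{a^4}$$


as the equation of transformation for the curvature.
Differentiation of (2.13) and application of the Frenet-Serret formulas
gives the equation of transformation for the derivative of $1/R$, namely:


(2.14) $$\frac{1}{R'} \frac{d}{ds'}\Bigl(\frac{1}{R'}\Bigr) = \Bigl(\frac{ds}{ds'}\Bigr)^2 \Bigl\{ \Bigl[\frac{ds}{ds'} \frac{1}{R} + \frac{2(x|\beta)}{a^2}\Bigr] \frac{d}{ds}\Bigl(\frac{1}{R}\Bigr) + \frac{2(x|\gamma)}{a^2} \frac{1}{RT} \Bigr\},$$


where $\gamma$ is the unit binormal vector for $C$.

It is to be observed that equations (2.13) and (2.14) are independent
of the surface, $\Sigma$, since they involve only intrinsic properties of the curve.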

As a consequence of the formulas developed above, the following theorem 


may be stated at once. 


THEOREM 2.1. If a surface, $\Sigma$, is carried by inversion into a surface, $\Sigma'$,


(a) the extrema of geodesic curvature, not at the center of inversion,
on the lines of curvature of class C''' of $\Sigma$ are carried into the similar extrema
of geodesic curvature on the corresponding lines of curvature of $\Sigma'$, points of
maximum (minimum) geodesic curvature going into points of maximum
(minimum) geodesic curvature;⁸


(b) the extrema of normal curvature, not at the center of inversion, on
the lines of curvature of class C''' of $\Sigma$ are carried into the similar extrema of
normal curvature on the corresponding lines of curvature of $\Sigma'$, points of
maximum (minimum) normal curvature going into points of minimum
(maximum) normal curvature.⁸


The proof is immediate, for the lines of curvature are characterized by
the fact that 1/τ = 0. Since ds/ds' ≠ 0, it follows by (2.9) that d(1/ρ)/ds
and d(1/ρ')/ds' pass through zero together and in the same direction. This
proves (a), since the extrema of geodesic curvature are characterized by the
fact that at these points (or arcs) the derivative of the geodesic curvature
changes sign, and the direction of passing through zero for the derivative


8 These statements as to exactly what the maximum and minimum points are
transformed into are valid only by virtue of our agreement regarding the relative
orientations of Σ and Σ'.




800 S. B. JACKSON. 


determines whether it is a maximum or a minimum point.⁹ By use of (2.11),
the proof of (b) follows in a similar manner, except that in this case d(1/r)/ds
and d(1/r')/ds' pass through zero in opposite directions, so that maximum
points are carried into minimum points and vice versa.


3. Geodesic vertices on spherical curves. An extremum of geodesic
curvature will be called a geodesic vertex, and a point (or arc) where the
geodesic curvature changes sign a geodesic inflection. The term vertex, alone,
will be used to indicate an extremum of the curvature, 1/R. It is necessary
to clarify this ambiguous term, curvature, however. For a twisted space curve,
C, the curvature is defined as inherently non-negative. For a plane curve,
however, we shall use the same word, curvature, to denote what is actually the
geodesic curvature of the curve with respect to the plane. This curvature may
be either positive or negative depending on the direction of rotation of the
tangent with reference to the orientation of the plane.

According to (2.9), geodesic vertices of a curve, C, are preserved under
inversion provided 1/τ = 0, as was seen in the proof of Theorem 2.1. Special
interest thus attaches to those surfaces for which 1/τ ≡ 0, i.e., for which all
curves are lines of curvature. It is well known that the only such surfaces
are the sphere and the plane. Henceforth we shall limit most of our attention
to curves on such surfaces. Part (b) of Theorem 2.1 becomes trivial for such
curves, but part (a) assumes the following form.


THEOREM 3.1. The geodesic vertices, not at the center of inversion,
on a plane or spherical curve of class C''' are carried by inversion into the
similar geodesic vertices of the transformed curve, points of maximum
(minimum) geodesic curvature being carried into points of maximum (minimum)
geodesic curvature.⁸

A simple closed spherical curve, C', of class C''', may be carried by a
suitably chosen inversion into a simple closed plane curve, C, of class C'''.
Since every simple closed plane curve of class C''', not a circle, has at least
four vertices * we obtain at once the following theorem.




THEOREM 3.2. A simple closed spherical curve, of class C''', not a circle,
has at least four geodesic vertices.¹⁰
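
Theorem 3.2 lends itself to a quick numerical illustration. The sketch below is an example under stated assumptions rather than part of the proof: it takes a hypothetical simple closed curve on the unit sphere (the radial projection of a wavy circle), evaluates the geodesic curvature with the standard unit-sphere expression (x | x' × x'')/|x'|³ using central differences, and counts the extrema as cyclic sign changes of consecutive differences. The names `curve`, `geodesic_curvature`, and `count_geodesic_vertices` are illustrative.

```python
import math

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def curve(t):
    # Hypothetical test curve: a wavy circle projected radially onto the unit sphere.
    u = (math.cos(t), math.sin(t), 1.0 + 0.3 * math.cos(2 * t))
    n = math.sqrt(dot(u, u))
    return tuple(c / n for c in u)

def geodesic_curvature(t, h=1e-3):
    # On the unit sphere: kappa_g = (x | x' x x'') / |x'|^3, via central differences.
    p0, p1, p2 = curve(t - h), curve(t), curve(t + h)
    d1 = tuple((c - a) / (2 * h) for a, c in zip(p0, p2))
    d2 = tuple((a - 2 * b + c) / (h * h) for a, b, c in zip(p0, p1, p2))
    return dot(p1, cross(d1, d2)) / dot(d1, d1) ** 1.5

def cyclic_sign_changes(values):
    # Exact zeros are skipped so that a tie cannot be counted twice.
    signs = [v > 0 for v in values if v != 0.0]
    return sum(signs[k] != signs[k - 1] for k in range(len(signs)))

def count_geodesic_vertices(n=720):
    # Sample on a half-offset periodic grid and count extrema of kappa_g.
    kg = [geodesic_curvature(2 * math.pi * (k + 0.5) / n) for k in range(n)]
    diffs = [kg[(k + 1) % n] - kg[k] for k in range(n)]
    return cyclic_sign_changes(diffs)
```

Calling `count_geodesic_vertices()` returns the number of geodesic vertices detected on the sample curve, which the theorem requires to be at least four.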


9 This proof holds only for isolated extrema, since the first derivative test may
fail for extrema which are limit points of other extrema. The theorem is valid for
this type of extrema also, but the proof is omitted as of scant interest for the present
paper.
10 This result is incorrectly stated by Fog, loc. cit. 3, in that he states that vertices
and geodesic vertices coincide on a spherical curve. This is incorrect. (See § 9.)




4. D-arcs and D-curves. It is necessary to introduce at this point a
series of lemmas dealing with certain types of spherical arcs and curves. The
results and methods parallel very closely certain work by Fenchel on spherical
arcs.¹¹

A simple spherical arc of class C' will be called a D-arc if (a) it consists
of a finite succession of arcs of class C''' with geodesic curvatures continuous
clear to their endpoints, and (b) the geodesic curvature is non-negative when
the arc is suitably directed. A simple closed spherical curve is called a D-curve
if every sub-arc of it is a D-arc. Otherwise expressed, a D-curve is a D-arc
that is closed. It follows at once from these definitions that on a D-arc or
D-curve the geodesic curvature is continuous except for a finite number of
points where one-sided limits exist. In the work that follows we shall consider
the sphere as oriented by viewing it from the tip of the outward drawn normal.


Lemma 4.1. In a sufficiently small neighborhood of any point on a
D-arc, the arc lies on or to the left of the directed tangent great circle at this
point.


At a point of continuity of 1/ρ the lemma follows from the definition of
non-negative geodesic curvature, while at a point of discontinuity of 1/ρ the
lemma holds for each of the two arcs of class C''' which meet at this point and
thus holds here also.


Lemma 4.2. If a D-arc joins two non-diametral points, A and B, of a
great circle and does not meet it elsewhere, the region (contained in a
hemisphere) bounded by the D-arc and the smaller great circle segment, AB, lies
to the left of the D-arc.


The proof given by Fenchel¹¹ for an arc of continuous non-vanishing
geodesic curvature holds without alteration in the present case. It may be
noted, however, that we have assumed A and B non-diametral, whereas Fenchel
could prove it.


Lemma 4.3. A D-arc, contained in a hemisphere, joining two diametral
points, A and B, is a great semicircle.


Consider the great semicircle APB directed from A to B, where P is any
point of the D-arc. In case P coincides with A(B) we shall mean by APB
the semicircle from A to B which is tangent to the D-arc at A(B). For some
point (or points) P the D-arc lies entirely on or to the right of this great
semicircle. The points common to the D-arc and this great semicircle APB
are a closed set, and consist entirely of points of tangency, except perhaps
for A and B. Moreover, in each case, the tangency must be in the direction
APB since otherwise the D-arc must either cross APB or cut itself, both of
which are impossible. If the lemma is false there exists at least one such
point of tangency in every neighborhood of which there are points of the
D-arc to the right of APB. But this contradicts Lemma 4.1 and is therefore
impossible.


11 W. Fenchel, “Über Krümmung und Windung geschlossener Raumkurven,”
Mathematische Annalen, vol. 101 (1929), pp. 238-252.


Lemma 4.4. A D-arc has at most a finite number of crossings with any
great circle.


By a crossing is meant any point or arc common to the D-arc and the
great circle in every neighborhood of which lie points on both sides of the
great circle. Fenchel¹¹ has proved this lemma for arcs of continuous,
non-vanishing, geodesic curvature, and this proof extends at once to D-arcs. It
should be observed that by Lemma 4.1 a tangency cannot be a crossing.

The remaining essential properties are most readily obtained by
considering first the case of D-arcs with non-vanishing geodesic curvature. At a
point of discontinuity of 1/ρ we demand also that both the one-sided limits
shall be different from zero.


Lemma 4.5. The tangent great circle to a D-arc of non-vanishing
geodesic curvature at a point P has no further points of contact with the D-arc
in a sufficiently small neighborhood of P.


In general 1/ρ = dφ/ds, where Δφ is, to within infinitesimals of higher
order, the angle between two neighboring tangent great circles. At a point at
which 1/ρ is continuous, dφ > 0 and the arc is actually turning away from
the tangent, while a point at which 1/ρ is discontinuous is the junction of
two arcs, each of which is turning away from the tangent.


Lemma 4.6. The tangent great circle to a D-curve with non-vanishing 
geodesic curvature, at a point P, has no further points in common with the 


curve. 


The number of points common to the circle and the curve is finite, for
by Lemma 4.4 the number of crossings is finite, and by Lemma 4.5 the closed
set of tangencies consists only of isolated points and is therefore finite. Assume
that there are common points, other than P, and let Q be the last such point
before P. Since 1/ρ > 0, it follows from Lemma 4.3 that P and Q are not
diametral. The tangency at P determines a directed great circle arc, PQ.




Let R denote the region, contained in a hemisphere, bounded by the D-arc
QP and the great circle arc PQ. R and the arc QP are on the same side of
the great circle arc PQ, and since, by Lemma 4.5, QP lies to the left of PQ
at P, R lies to the left of PQ. Since PQ and QP are similarly directed at P,
R lies also to the left of the D-arc QP. It follows readily from Lemma 4.2
that PQ is the shorter great circle arc joining P and Q. At P the D-curve
actually passes into the interior of R, by Lemma 4.5, and therefore, in order
to return to Q, it must return to a point of the great circle arc PQ. This is
impossible, by Lemma 4.2, since the region bounded would be on the right.
Thus we obtain a contradiction and the lemma is proved.

Fenchel¹¹ obtained the following result, restated here only for convenience.


Lemma 4.7. If A is an arc of class C'' on a surface of positive Gaussian
curvature, a similarly directed geodesic parallel to A contained in the field of
geodesics perpendicular to A and lying to the left of A has greater (algebraic)
geodesic curvature than A at corresponding points.


Since by this lemma a geodesic parallel to a D-curve which lies sufficiently 
near it and to its left is surely a D-curve we are led at once to: 


Lemma 4.8. The geodesic parallels to a D-curve that lie sufficiently near 
it and to its left are D-curves of non-vanishing geodesic curvature. 


Lemma 4.9. A tangent great circle to a D-curve cannot cross the curve. 


Since, by Lemma 4.1, a tangency cannot be a crossing, the curve and a 
great circle meet at some angle, not zero, at a crossing. Suppose there exists 
a tangent great circle that crosses the curve. If the D-curve is deformed to 
its left into an arbitrarily near geodesic parallel, the crossing points and the 
tangent great circle deform continuously, and the geodesic parallel has a 
crossing with a tangent great circle. Since this contradicts Lemma 4.6, the
assumption is false and the lemma is proved. 

It is clear from this lemma that every D-curve is contained in a closed 
hemisphere, which leads to the following result. 


Lemma 4.10. A D-curve containing two diametral points is a great
circle.


Since the entire curve, and hence each of the arcs into which the diametral
points divide it, is contained in a hemisphere, it follows by Lemma 4.3 that
each arc is a great semicircle. The conclusion then follows by the continuity
of the tangent.




It can be shown by further discussion that the D-curves have the character 
of ovals on the sphere. In particular, a D-curve, not a great circle, has in 
common with any tangent great circle either a single point or a single arc,
less than a semicircle. Since this property is not essential for our work, the 
details of the discussion will be omitted. 


5. Arcs of type Ω. Graustein⁵ has developed a theory of certain plane
arcs which he has called arcs of type Ω. We shall consider analogous spherical
arcs which will also be designated as of type Ω.

A spherical arc, AB, of class C''', is said to be of type Ω if (a) its geodesic
curvature, when it is traced from A to B, is non-negative and is not identically
zero; (b) the tangent great circles at A and B coincide; (c) the arc meets this
common tangent only at A and B; and (d) it is simple except that B may
coincide with A. Condition (c) is not actually necessary as a part of the
definition since it essentially follows from the other three conditions and the
work of § 4. However, it is convenient and we shall retain it.

An arc of type Ω, which may be designated without ambiguity by Ω, is
clearly a D-arc. Moreover, it is tangent to the common tangent great circle
in the same direction at A and B, since otherwise Lemma 4.1 would be
violated at one point or the other. By adjoining to Ω the great circle arc BA,
directed in the sense induced by Ω, there arises a D-curve, Ω̄, with
discontinuities in the geodesic curvature at A and B. Since Ω is not a great circle
arc, Ω̄ is not a great circle, and by Lemma 4.10 the arc BA is less than a
semicircle. From this discussion and Lemma 4.9 we have at once the following
result.


Lemma 5.1. The closure, Ω̄, of an arc of type Ω lies on one side of every
tangent great circle. The common tangent great circle at A and B has just
the contact arc BA, less than a semicircle, in common with Ω̄.


Consider the point M' diametrically opposite to a point M on the great
circle arc, BA, of Ω̄. Let M', which by Lemma 5.1 does not lie on Ω̄, be chosen
as center of stereographic projection. The great circle containing the arc BA
goes into a straight line, and Ω goes into an arc Ω', lying on one side of this
line and tangent to it at the projected points A' and B'. Consider the great
circle K tangent to Ω at any point other than A or B. The point M' lies on
one side of K, while M, and with it all of Ω̄, lies on the other side by Lemma
5.1, since the circle K by hypothesis is not the common tangent great circle
at A and B. Thus K projects into a circle K' with Ω' in its interior. Since,
at the point of tangency, Ω' must have at least as great curvature as K', Ω' has




non-vanishing curvature, except perhaps at A' and B'. It is seen at once that
Ω' is a plane arc of type Ω as defined by Graustein.⁵

The direction on a plane (spherical) arc of type Ω for which the (geodesic)
curvature is non-negative will be called the positive direction. It is
readily shown that the positive directions on Ω and Ω' correspond. When Ω
is traced so that 1/ρ ≥ 0, ν is directed toward the interior of Ω̄, that is, toward
the smaller of the two simply connected regions into which Ω̄ divides the sphere.
Since M' is not in this interior, the interior of Ω̄ projects into the interior of
Ω̄', and ν projects into the vector ν' directed toward the interior of Ω̄'. But
for Ω' the first of formulas (2.6) becomes dα'/ds' = ν'/R'. This shows that
1/R' ≥ 0 when ν' is directed toward the interior of Ω̄', and the positive
directions on Ω and Ω' correspond. We have therefore established


Lemma 5.2. By a suitable inversion, a spherical arc of type Ω can be
transformed into a plane arc of type Ω, so that the positive directions on the
two arcs correspond.¹²


From Lemma 5.2, Theorem 3.1, and Graustein's theorem⁵ that the
non-negative curvature of a plane arc of type Ω has at least one minimum interior
to the arc or is constant throughout the arc, we have at once the following
theorem.


THEOREM 5.1. A spherical arc of type Ω has a minimum of non-negative
geodesic curvature interior to the arc, or has constant geodesic curvature
throughout the arc.


This leads readily to a second result, since an arc of type Ω with constant
geodesic curvature is a circle.


THEOREM 5.2. A closed spherical curve of class C''' which has geodesic
inflections and contains an arc of type Ω, not a circle, has at least four geodesic
vertices.


If the curve is directed so that on the arc of type Ω, 1/ρ ≥ 0, it follows
from Theorem 5.1 that there exists at least one non-negative minimum of 1/ρ.
Since there are geodesic inflections, 1/ρ becomes negative, and there must also
exist a negative minimum. It follows that there must also be two maxima,
and thus at least four geodesic vertices.


12 This lemma is established only by virtue of the relative orientations of Σ and Σ'
agreed on when we derived formulas (2.6). In the present case it implies that if the
sphere is oriented by the outward drawn normals, the plane is oriented by the normals
directed away from the sphere.


6. Extrema of the torsion of spherical curves. The torsion of a curve
C on a surface Σ is readily found to be


(6. 1) 


by replacing the angle which enters into Bonnet's formula, 1/T = dφ/ds + 1/τ,
by its value in terms of 1/r and 1/ρ. For a spherical curve on a sphere of
radius b, 1/r = −1/b, d(1/r)/ds = 0, 1/τ = 0, and (6.1) reduces to
(6.2) 
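
The displayed formulas (6.1) and (6.2) did not survive in this copy. The following is only a sketch of how expressions of this kind arise, under the assumption that the angle φ in Bonnet's formula satisfies 1/r = cos φ / R and 1/ρ = sin φ / R; the paper's own sign and orientation conventions may differ, so the overall signs below should not be read as the author's.

```latex
% Sketch under assumed conventions: 1/r = \cos\varphi / R, 1/\rho = \sin\varphi / R.
\[
\frac{1}{R^2} = \frac{1}{r^2} + \frac{1}{\rho^2}, \qquad
\tan\varphi = \frac{1/\rho}{1/r},
\]
\[
\frac{d\varphi}{ds}
  = \frac{\dfrac{1}{r}\,\dfrac{d}{ds}\!\left(\dfrac{1}{\rho}\right)
        - \dfrac{1}{\rho}\,\dfrac{d}{ds}\!\left(\dfrac{1}{r}\right)}
         {\dfrac{1}{r^2} + \dfrac{1}{\rho^2}},
\qquad
\frac{1}{T} = \frac{d\varphi}{ds} + \frac{1}{\tau}.
\]
% On a sphere of radius b: 1/r = -1/b, d(1/r)/ds = 0, 1/\tau = 0, hence
\[
\frac{1}{T}
  = -\frac{1}{b}\,\frac{1}{\dfrac{1}{b^2} + \dfrac{1}{\rho^2}}\,
    \frac{d}{ds}\!\left(\frac{1}{\rho}\right)
  = -\frac{R^2}{b}\,\frac{d}{ds}\!\left(\frac{1}{\rho}\right).
\]
```

Whatever the sign convention, the torsion of a spherical curve comes out as a nowhere-vanishing multiple of d(1/ρ)/ds, which is the only feature the subsequent lemma uses.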


Since the geodesic vertices are the points where d(1/p)/ds changes sign, 


we obtain the following result directly from (6.2). 


LEMMA 6.1. On a spherical curve of class C''' the geodesic vertices are
precisely the points where the torsion changes sign.


Let us call a point (or arc) where 1/T changes sign a transition of the
torsion. It has been proved by Fenchel for a closed spherical curve of class
C''' that ∫_C ds/T = 0.¹³ Hence, for a closed spherical curve, a transition of
the torsion is, in reality, a point at which the torsion crosses its average value.
A maximum (minimum) point of the torsion on a closed spherical curve where
1/T > 0 (< 0) will be called a primary extremum. Other extrema will be
termed secondary.¹⁴


Lemma 6.2. If C is any closed spherical curve of class C''', p_T = s_T + g,
where p_T and s_T are, respectively, the numbers of primary and secondary
extrema of the torsion, and g is the number of geodesic vertices on C. It is
understood that if s_T or g is infinite the equation merely implies that p_T is also
infinite.¹⁴


13 W. Fenchel, “Über einen Jacobischen Satz der Kurventheorie,” Tôhoku
Mathematical Journal, vol. 39 (1934), pp. 95-97. This result also follows by integrating (6.2).

14 Compare W. C. Graustein, loc. cit. 5. Also W. C. Graustein and S. B. Jackson,
loc. cit. 4.


By Lemma 6.1, g represents the number of transitions of 1/T on C.
If 1/T ≡ 0, all three quantities are zero and the formula is trivially valid.
In the contrary case there exist at least two transitions. Between two
consecutive transitions, the primary extrema outnumber the secondary by one or
both are infinite, for on this arc the types of extrema alternate, and the first
and last are primary. Finally if g is infinite then p_T is also infinite, since
between two transitions there is at least one primary extremum. Thus the
lemma holds in every case.
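
The counting argument above can be illustrated on sampled data. The sketch below is a hypothetical example, not taken from the paper: it treats a periodic sequence of samples as a stand-in for the torsion, subtracts the average, and counts primary extrema, secondary extrema, and transitions. It assumes that no centered sample is exactly zero and that no two neighboring samples are equal; the test function and the name `classify_extrema` are illustrative.

```python
import math

def classify_extrema(values):
    """Count (primary, secondary, transitions) for a periodic sample sequence.

    Assumes no centered value is exactly zero and no two neighbors tie.
    """
    n = len(values)
    mean = sum(values) / n
    f = [v - mean for v in values]          # measure from the average value
    primary = secondary = transitions = 0
    for k in range(n):
        prev, cur, nxt = f[k - 1], f[k], f[(k + 1) % n]
        if (cur > 0) != (nxt > 0):
            transitions += 1                # sign change between k and k+1
        is_max = cur > prev and cur > nxt
        is_min = cur < prev and cur < nxt
        if (is_max and cur > 0) or (is_min and cur < 0):
            primary += 1                    # positive maximum or negative minimum
        elif is_max or is_min:
            secondary += 1
    return primary, secondary, transitions

# Hypothetical periodic test function sampled on a uniform grid.
n = 2000
samples = [math.sin(t) + 0.8 * math.sin(4 * t) + 0.3 * math.cos(7 * t)
           for t in (2 * math.pi * k / n for k in range(n))]
p, s, g = classify_extrema(samples)
```

Between two consecutive sign changes the extrema alternate and begin and end with a primary one, so the counts satisfy `p == s + g`, mirroring the lemma.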

The following theorem is a direct consequence of Lemma 6.2 and
Theorem 3.2.


THEOREM 6.1. The number of primary extrema of the torsion on a
simple closed spherical curve of class C''', not a circle, exceeds the number of
secondary extrema by at least four, or both are infinite.¹⁵


7. Geodesic vertices on tangent indicatrices of spherical curves. If
C is a closed spherical curve of length l, a maximum (minimum) point of 1/ρ
at which 1/ρ − 1/a > 0 (< 0) will be called a primary geodesic vertex, where
1/a = (1/l) ∫_C ds/ρ; i.e. 1/a is the average value of 1/ρ taken over C. All
other geodesic vertices will be termed secondary. A point (or arc) where
1/ρ − 1/a changes sign will be called a transition of 1/ρ. Precisely as in
Lemma 6.2, it can be shown that


(7.1) p = s + t


where p, s, and t are, respectively, the numbers of primary and secondary
geodesic vertices, and the number of transitions of 1/ρ on C.
It has been shown by Fenchel¹³ that for a regular space curve C_0


(7.2) 1/ρ = R_0/T_0


15 The existence of at least four transitions of the torsion for any simple closed
curve on an ovaloid was proved by H. Mohrmann, “Die Minimalzahl der stationären
Ebenen eines räumlichen Ovals,” Sitz. der Königlich Bayerischen Akad. der
Wissenschaften, Math.-Phys. Klasse, München (1917), pp. 51-53. This might have been used
in conjunction with Lemma 6.1 to establish Theorem 3.2.




where 1/ρ is the geodesic curvature of the tangent indicatrix, or, equivalently,
ds_0/T_0 = ds/ρ, where s_0 and s are the arc lengths on C_0 and its tangent
indicatrix, C, respectively. If, in particular, C_0 is itself a closed spherical
curve, ∫_C ds/ρ = ∫_{C_0} ds_0/T_0 = 0, as was noted in § 6, whence it follows that a
necessary condition for a closed spherical curve, C, to be the tangent indicatrix
of some other closed spherical curve, C_0, is that ∫_C ds/ρ = 0; i.e. 1/a = 0.
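
As a worked illustration of the relation between the geodesic curvature of the tangent indicatrix and the ratio R_0/T_0, consider a circular helix. This example is not in the original, the helix is not a closed curve, and the result is stated only up to the sign fixed by the orientation convention; the symbols a, c, and θ are introduced here for illustration.

```latex
% Hypothetical example: circular helix x(t) = (a\cos t,\; a\sin t,\; ct), with a, c > 0.
\[
\frac{1}{R_0} = \frac{a}{a^2 + c^2}, \qquad
\frac{1}{T_0} = \frac{c}{a^2 + c^2}, \qquad
\frac{R_0}{T_0} = \frac{c}{a}.
\]
% Its tangent indicatrix is a circle of latitude on the unit sphere:
\[
\alpha(t) = \frac{(-a\sin t,\; a\cos t,\; c)}{\sqrt{a^2 + c^2}},
\qquad
\sin\theta = \frac{a}{\sqrt{a^2 + c^2}}, \quad
\cos\theta = \frac{c}{\sqrt{a^2 + c^2}},
\]
% where \theta is the polar angle of the circle. A circle of latitude at polar
% angle \theta has geodesic curvature of magnitude \cot\theta, so
\[
\left|\frac{1}{\rho}\right| = \cot\theta = \frac{c}{a} = \frac{R_0}{T_0}.
\]
```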

Since, for a spherical curve C_0, 1/R_0 ≠ 0, it follows that in this case
1/ρ and 1/T_0 have corresponding transitions, and t, in (7.1), equals the
number of transitions of 1/T_0 which, by Lemma 6.1, equals the number of
geodesic vertices on C_0. Thus we have proved the following theorem.


THEOREM 7.1. If C_0 is a closed spherical curve of class C''', and C is its
tangent indicatrix, then

p = s + g_0

where p and s are the numbers of primary and secondary geodesic vertices
on C, respectively, and g_0 is the number of geodesic vertices on C_0.


Let C_0 be a closed spherical curve of class C^{n+2} and consider the sequence
of closed spherical curves C_i, i = 1, ..., n, such that C_i is the tangent
indicatrix of C_{i-1}. C_i will be called the i-th tangent indicatrix of C_0. It may
be noted that C_1 and C_2 are the ordinary tangent and normal indicatrices of C_0.
The last theorem may now be generalized in the following way.


THEOREM 7.2. If C_0 is a closed spherical curve of class C^{n+2} and C_i
is its i-th tangent indicatrix, i = 1, ..., n, then

$p_r = s_r + 2\sum_{i=1}^{r-1} s_i + g_0, \qquad r \le n,$

where p_i and s_i are, respectively, the numbers of primary and secondary
geodesic vertices on C_i, and g_0 is the number of geodesic vertices on C_0.


The proof is by induction on r. For r = 1 the above contention is
precisely Theorem 7.1. If the theorem is true for r = t, then
$p_t = s_t + 2\sum_{i=1}^{t-1} s_i + g_0$, and the number of geodesic vertices on C_t is
$p_t + s_t = 2\sum_{i=1}^{t} s_i + g_0$. Since C_{t+1} is the tangent indicatrix of C_t, it
follows by Theorem 7.1 that $p_{t+1} = s_{t+1} + 2\sum_{i=1}^{t} s_i + g_0$, and the
induction is complete.
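
The induction can be mirrored by a short computation. The sketch below is illustrative only: it applies the step of Theorem 7.1 repeatedly to a made-up list of secondary counts and compares the result with the closed form p_r = s_r + 2(s_1 + ... + s_{r-1}) + g_0. The function names and the sample numbers are assumptions.

```python
def primary_counts(g0, secondary):
    """Apply Theorem 7.1 step by step: p_i = s_i + g_{i-1}, then g_i = p_i + s_i."""
    g_prev = g0
    result = []
    for s in secondary:
        p = s + g_prev
        result.append(p)
        g_prev = p + s      # total geodesic vertices on the i-th indicatrix
    return result

def closed_form(g0, secondary, r):
    """Closed form of Theorem 7.2 for 1 <= r <= len(secondary)."""
    return secondary[r - 1] + 2 * sum(secondary[:r - 1]) + g0

# Made-up data: g_0 = 4 and secondary counts s_1, ..., s_4.
example = primary_counts(4, [0, 2, 4, 1])
```

For the sample data the step-by-step values agree with the closed form for every r.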




COROLLARY 7.2.1. The k-th tangent indicatrix of a closed spherical
curve, C_0, contains at least as many geodesic vertices as C_0. The numbers are
equal if and only if the first k tangent indicatrices have only primary vertices.


The proof is immediate, for $p_k + s_k = 2\sum_{i=1}^{k} s_i + g_0 \ge g_0$, and the equality
sign holds if and only if s_i = 0, i = 1, ..., k.


COROLLARY 7.2.2. The number of primary geodesic vertices on a tangent 
indicatrix of any order of a simple closed spherical curve, not a circle, exceeds 


the number of secondary geodesic vertices by at least four. 


For $p_r - s_r = 2\sum_{i=1}^{r-1} s_i + g_0 \ge g_0$, and by Theorem 3.2, g_0 ≥ 4. It is clear,
by this last corollary, that a necessary condition that a closed spherical curve, C,
be a tangent indicatrix of a simple closed spherical curve, not a circle, is that
p − s = t ≥ 4; that is, the curve must contain at least four geodesic inflections.
Thus a figure eight curve with only two geodesic inflections cannot possibly be a
tangent indicatrix of a simple closed spherical curve.

By suitably combining the relationships discussed here, it would be 
possible to state several interesting theorems. One illustration will suffice. 

THEOREM 7.3. If C_1 and C_2 are two mutually inverse spherical curves,
with first tangent indicatrices C̄_1 and C̄_2, respectively, then p̄_1 − s̄_1 = p̄_2 − s̄_2,
where p̄_i and s̄_i are, respectively, the numbers of primary and secondary
geodesic vertices on C̄_i, i = 1, 2.

For p̄_i − s̄_i = g_i, where g_i is the number of geodesic vertices on C_i,
i = 1, 2, and by Theorem 3.1, g_1 = g_2.


8. Dual vertices. As a consequence of formula (7.2) there exists a
very simple relationship between the geodesic vertices on the first tangent
indicatrix of a closed regular space curve, C, and the dual vertices which are
defined by Takasu¹⁶ as the extrema of the dual curvature, 1/P = −T/R.
By (7.2) 1/ρ_1 = R/T, where 1/ρ_1 is the geodesic curvature of the first tangent
indicatrix, whence it follows that 1/P = −ρ_1. If 1/ρ_1 ≠ 0, then 1/P is
continuous and the dual vertices of C correspond exactly to the geodesic
vertices of the tangent indicatrix. In the contrary case, however, 1/P becomes
infinite. This occurs whenever 1/T becomes zero.


16 T. Takasu, loc. cit. 4.


If, in particular, C is a closed spherical curve, not a circle, it follows from
(6.2) that 1/T passes through zero at least twice, and 1/P cannot be
continuous. This fact is apparently overlooked by Takasu in proving his dual
four-vertex theorem for spherical ovals. He makes use of the continuity of
1/P to establish the existence of at least two zeros for its derivative. The
proof in question is thus invalid, but curiously the theorem itself is true, as
follows at once by Corollary 7.2.2 and the relation 1/P = −ρ_1.

Takasu observes the relationship indicated above between a curve and its 
tangent indicatrix, but states it incorrectly as a correspondence of dual vertices 
of the curve and vertices of the tangent indicatrix instead of geodesic vertices. 
As we shall see in the next paragraph, not all vertices are geodesic vertices. 
As a matter of fact, the vertices of the tangent indicatrix which are not also 
geodesic vertices are the points that correspond to the discontinuities of 1/P 
instead of its extrema. 


9. Vertices of spherical curves. In concluding this study of spherical 
curves, it is natural to inquire what relationship exists between the vertices and 
geodesic vertices of such a curve, and what can be said regarding the number 
of vertices. If the curve, C, lies on a sphere of radius b, then

$\dfrac{1}{R^2} = \dfrac{1}{\rho^2} + \dfrac{1}{b^2}.$

Differentiation of this equation yields

(9.1)  $\dfrac{1}{R}\,\dfrac{d}{ds}\left(\dfrac{1}{R}\right) = \dfrac{1}{\rho}\,\dfrac{d}{ds}\left(\dfrac{1}{\rho}\right).$

The points of C at which d(1/R)/ds, 1/ρ, and d(1/ρ)/ds change sign are,
respectively, the vertices, the geodesic inflections, and the geodesic vertices.
Since, for a spherical curve, 1/R ≠ 0, the left side of (9.1) changes sign
precisely at the vertices. Similarly, since a geodesic inflection never coincides
with a geodesic vertex, the right side of (9.1) changes sign precisely at the
geodesic inflections and the geodesic vertices. This leads at once to


THEOREM 9.1. The vertices of a spherical curve of class C''' consist of
the geodesic vertices and geodesic inflections.¹⁷


17 This theorem is true for curves of class C'', but our argument, based on (9.1),
assumes class C'''.


From this theorem it can readily be shown that vertices are not preserved 
by inversion, even for the special case of stereographic projection. Let a circle 
be tangent to an ellipse at a vertex and cut it in two other points. If a sphere 
is chosen on which the above circle projects stereographically into a great 
circle, the projection of the ellipse has geodesic inflections, for otherwise it 
would be a D-curve, and by Lemma 4.9 could not cross a tangent great circle. 
Thus the projection has at least six vertices, four geodesic vertices and at least 
two geodesic inflections, as compared with four vertices on the ellipse. 

The statements in the last paragraph are not contrary to (2.14), for
although for a plane curve 1/T = 0 and thus d(1/R')/ds' is a multiple
of d(1/R)/ds, the other factor, (1/R)ds/ds' + 2(x|β)/b², may become
zero, whence it follows that d(1/R')/ds' may change sign, even though
d(1/R)/ds does not.


Theorem 9.1 may be conveniently restated in the following form.

COROLLARY 9.1.1. If C is any spherical curve of class C''',

v = g + i

where v, g, and i are, respectively, the numbers of vertices, geodesic vertices,
and geodesic inflections on C.
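
The count v = g + i can be checked on a concrete curve. The sketch below is a numerical illustration under stated assumptions, not part of the paper: it uses a hypothetical closed curve on the unit sphere that weaves across the equator, computes the space curvature |x' × x''|/|x'|³ and the geodesic curvature (x | x' × x'')/|x'|³ independently by central differences, and counts extrema and sign changes on a periodic grid. All names are illustrative.

```python
import math

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def curve(t):
    # Hypothetical test curve: crosses the equator six times, projected to the unit sphere.
    u = (math.cos(t), math.sin(t), 0.2 * math.sin(3 * t))
    n = math.sqrt(dot(u, u))
    return tuple(c / n for c in u)

def curvatures(t, h=1e-3):
    # Returns (space curvature 1/R, geodesic curvature 1/rho) from central differences.
    p0, p1, p2 = curve(t - h), curve(t), curve(t + h)
    d1 = tuple((c - a) / (2 * h) for a, c in zip(p0, p2))
    d2 = tuple((a - 2 * b + c) / (h * h) for a, b, c in zip(p0, p1, p2))
    w = cross(d1, d2)
    speed3 = dot(d1, d1) ** 1.5
    return math.sqrt(dot(w, w)) / speed3, dot(p1, w) / speed3

def cyclic_sign_changes(values):
    # Exact zeros are skipped so that a tie cannot be counted twice.
    signs = [v > 0 for v in values if v != 0.0]
    return sum(signs[k] != signs[k - 1] for k in range(len(signs)))

def count_vertices_and_inflections(n=720):
    samples = [curvatures(2 * math.pi * (k + 0.5) / n) for k in range(n)]
    kappa = [s[0] for s in samples]
    kg = [s[1] for s in samples]
    d_kappa = [kappa[(k + 1) % n] - kappa[k] for k in range(n)]
    d_kg = [kg[(k + 1) % n] - kg[k] for k in range(n)]
    v = cyclic_sign_changes(d_kappa)   # vertices: extrema of 1/R
    g = cyclic_sign_changes(d_kg)      # geodesic vertices: extrema of 1/rho
    i = cyclic_sign_changes(kg)        # geodesic inflections: sign changes of 1/rho
    return v, g, i
```

Calling `count_vertices_and_inflections()` returns the triple (v, g, i) for the sample curve, which can then be compared with the corollary.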


From this follow readily several interesting corollaries. 


COROLLARY 9.1.2. A simple closed spherical curve of class C''', not a
circle, has at least four vertices.

For by Theorem 3.2, g ≥ 4.

A D-curve with continuous geodesic curvature is called an oval. Using
this definition, we state


COROLLARY 9.1.3. A simple closed spherical curve of class C''', not an
oval, has at least six vertices.


Since, by hypothesis, there are geodesic inflections, and these necessarily
occur in pairs, i ≥ 2, while by Theorem 3.2, g ≥ 4.


COROLLARY 9.1.4. A closed spherical curve of class C''' which contains
geodesic inflections has at least four vertices.




For by hypothesis i ≥ 2 and g ≥ 2 on any closed curve of class C''.


COROLLARY 9.1.5. A closed spherical curve of class C''' which is a
tangent indicatrix of any order of a closed spherical curve, not a circle, has
at least four vertices.

By hypothesis 1/ρ ≢ 0, and a necessary condition that a curve be a
tangent indicatrix of a closed spherical curve is that ∫_C ds/ρ = 0. Hence 1/ρ
changes sign and Corollary 9.1.4 applies.


COROLLARY 9.1.6. A closed spherical curve of class C''' which is a
tangent indicatrix of any order of a simple closed spherical curve, not a circle,
has at least six vertices.


As in the last corollary i ≥ 2, and by Corollary 7.2.2, g ≥ 4.


THE UNIVERSITY OF WISCONSIN. 




A COMPLETE CHARACTERIZATION OF SECTIONAL FAMILIES
OF CURVES.* ¹


By ANNETTE VASSELL. 


The object of this paper is to study the geometric character of a special 
type of family of plane curves, the sectional family. A sectional family is 
obtained by projecting from a fixed point upon a fixed plane all the plane 
sections of an arbitrary surface. A set of six plane geometrical properties is 
found for these families and it is proved that they are characteristic. This 
problem was first considered by Kasner² in 1908 and the solution is analogous
to his differential-geometric characterization of dynamical trajectories.³ Of
the individual properties mentioned below, I is due to Kasner, as also II, III
and III' for the case of developable surfaces. Moreover IV and V were
suggested by his properties V and VI for dynamical trajectories.


By making use of a projective transformation in space which leaves every 
point of the fixed plane invariant and carries the center of projection to the 
point at infinity in the direction orthogonal to the plane, it is readily seen that 
a given sectional family obtained by central projection from one surface can 
always be thought of as obtained by orthogonal projection from another surface 
projectively related to the first. Let the fixed plane be the x,y plane, let
z = f(x, y) be the equation of a surface and let z = ax + by + c be the
equation of a general cutting plane. Projecting the plane sections orthogonally
upon the x,y plane we get as the equation of the resulting family of curves


ax + by + c − f(x, y) = 0.

A sectional family is thus a certain kind of three-parameter system of plane
curves.
By differentiating and eliminating the constants from the last equation
we find the differential equation of the system of curves to be


(1)  $(f_{xx} + 2f_{xy}y' + f_{yy}y'^2)\,y''' = (f_{xxx} + 3f_{xxy}y' + 3f_{xyy}y'^2 + f_{yyy}y'^3)\,y'' + 3(f_{xy} + f_{yy}y')\,y''^2.$
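
The elimination leading to the differential equation of the family can be written out in a few lines. This is a sketch of the routine computation, supplied here for convenience; the abbreviations Q_2 and Q_3 are introduced only for this illustration.

```latex
% Differentiate a x + b y + c = f(x, y) three times along a curve y = y(x):
\begin{align*}
a + b\,y' &= f_x + f_y\,y', \\
b\,y''    &= Q_2 + f_y\,y'', \qquad
             Q_2 = f_{xx} + 2 f_{xy}\,y' + f_{yy}\,y'^2, \\
b\,y'''   &= Q_3 + 3\,(f_{xy} + f_{yy}\,y')\,y'' + f_y\,y''', \qquad
             Q_3 = f_{xxx} + 3 f_{xxy}\,y' + 3 f_{xyy}\,y'^2 + f_{yyy}\,y'^3 .
\end{align*}
% The last two lines give (b - f_y) y'' = Q_2 and
% (b - f_y) y''' = Q_3 + 3 (f_{xy} + f_{yy} y') y''. Eliminating (b - f_y):
\[
Q_2\,y''' = Q_3\,y'' + 3\,(f_{xy} + f_{yy}\,y')\,y''^2 .
\]
```

The constants a and c drop out after the first differentiation, and b is removed by the final elimination, so the resulting third-order equation describes the whole three-parameter family.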


* Received July 27, 1939; revised January 8, 1940.

1 Abstract in Bulletin of the American Mathematical Society, vol. 45 (1939), p. 91.

2 Abstracts in Bulletin of the American Mathematical Society, vol. 14 (1908), p.
356; vol. 36 (1930), p. 51.

3 “The Trajectories of Dynamics,” Transactions of the American Mathematical
Society, vol. 7 (1906), pp. 401-424. Also “Differential-geometric Aspects of Dynamics,”
Princeton Colloquium Lectures on Mathematics (1913; new edition 1934).




814 ANNETTE VASSELL. 


This is of the general type⁴

(2)  $y''' = G(x, y, y')\,y'' + H(x, y, y')\,y''^2.$


Kasner proved that all triply infinite families whose differential equation is
of the form (2) where G and H are any functions of x, y, y' have the following
geometrical property.⁵


Property I. If to each of the ∞¹ curves having a given lineal element in
common the osculating parabola is drawn at that element, the foci will lie on
a circle through the point of the element.

And conversely, every system of curves possessing Property I is defined by
a differential equation of the form (2). As shown in the reference, the focal
circle corresponding to a lineal element x, y, y' has the equation


(3)  $2G(X^2 + Y^2) + \{3(y'^2 - 1) - y'(y'^2 + 1)H\}X + \{(y'^2 + 1)H - 6y'\}Y = 0,$


where X, Y denote current coördinates referred to axes drawn through the
given point as origin and parallel to the x- and y-axes respectively.

The special form of the coefficients G and H in the sectional case indicates
that sectional families possess other properties besides Property I. We observe
that H has the form

(4)  $H = \dfrac{3\,[\,y' - (w_1 + w_2)/2\,]}{(y' - w_1)(y' - w_2)},$

where w_1, w_2 are the roots of the equation f_xx + 2f_xy y' + f_yy y'² = 0 and are
therefore the projections of the asymptotic directions on the surface. We have


(5)  $w_1 + w_2 = -2f_{xy}/f_{yy}, \qquad w_1 w_2 = f_{xx}/f_{yy}.$


We shall show that if w_1, w_2 in (4) are any two functions of x and y,
whether derived from a surface or not, the following property will hold and
conversely that if this property holds then H must be of the form (4). This
property was stated without proof by G. Comenetz.⁶


Property II. There exist for each point x, y of the plane two directions
w_1, w_2 such that any direction y' and the reflection in y' of the tangent to the
focal circle determined by x, y, y' are pairs in the involution whose fixed
directions are w_1 and w_2.


⁴ E. Kasner, "Dynamical Trajectories and the ∞³ Plane Sections of a Surface,"
Proceedings of the National Academy of Sciences, vol. 17 (1931), pp. 370-376.

⁵ "The Trajectories of Dynamics," p. 409.

⁶ "Curvature Trajectories," American Journal of Mathematics, vol. 58 (1936), p.
225.




CHARACTERIZATION OF SECTIONAL FAMILIES OF CURVES. 815 


The slope of the tangent to the focal circle (3) at the given point is

        \frac{3(y'^2 - 1) - y'(y'^2 + 1)H}{6y' - (y'^2 + 1)H}.

It is easily computed that the reflection in y' of this slope is y' - 3/H. The
necessary and sufficient condition⁷ for Property II is that

        y'(y' - 3/H) - \tfrac{1}{2}(w_1 + w_2)\,[\,y' + (y' - 3/H)\,] + w_1 w_2 = 0,

and this equation is equivalent to (4).
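The two computations just described can be spot-checked with exact arithmetic. The following sketch is an illustration added here, not the author's procedure. It assumes the standard formula for reflecting a line of slope s in a line of slope t (the tangent of 2θ - φ, where t = tan θ and s = tan φ), takes the focal-circle tangent slope as [3(t^2 - 1) - t(t^2 + 1)H] / [6t - (t^2 + 1)H] with t = y', and takes H in the form H = 3[t - (w_1 + w_2)/2] / [(t - w_1)(t - w_2)]. The function names are hypothetical.

```python
from fractions import Fraction


def reflect(s, t):
    """Slope of the mirror image of a line of slope s in a line of slope t."""
    return (2 * t - s * (1 - t * t)) / ((1 - t * t) + 2 * t * s)


def focal_tangent_slope(t, H):
    """Slope of the tangent to the focal circle at the given point (t = y')."""
    return (3 * (t * t - 1) - t * (t * t + 1) * H) / (6 * t - (t * t + 1) * H)


def H_from_directions(t, w1, w2):
    """H written in terms of the two fixed directions w1, w2 of the involution."""
    return 3 * (t - (w1 + w2) / 2) / ((t - w1) * (t - w2))


def involution_residual(t, w1, w2):
    """Left side of the involution condition for the pair t and r = t - 3/H."""
    r = t - 3 / H_from_directions(t, w1, w2)
    return t * r - (w1 + w2) / 2 * (t + r) + w1 * w2
```

With Fraction inputs, `reflect(focal_tangent_slope(t, H), t)` equals `t - 3/H`, and `involution_residual` returns zero whenever H has the stated form.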

When w_1 = w_2 (we then write them as w), the involution is singular so
that the reflection in y' of the tangent to the focal circle is fixed and coincides
with w.

Next we note that with the use of (5), G of (1) may be written as

(6)    G = \frac{h y'^3 + m y'^2 + n y' + k}{(y' - w_1)(y' - w_2)},

where

(7)    h = f_{yyy}/f_{yy},   m = 3f_{xyy}/f_{yy},   n = 3f_{xxy}/f_{yy},   k = f_{xxx}/f_{yy}.

Moreover, when w_1 = w_2, it may be verified that the numerator of (6) has
the factor (y' - w), that is

(8)    h w^3 + m w^2 + n w + k = 0.


We shall prove that if h, m, n, k, w_1, w_2 are any functions of x and y
subject only to the condition that if w_1 = w_2, (8) holds, the following property
will be true of systems (2) with H given by (4), and conversely that if this
property holds then G must be of the form (6) subject to (8) when w_1 = w_2.


Property III. In each direction through a given point O there passes
one curve which has contact of third order with its circle of curvature. When
the directions w_1, w_2 occurring in II are distinct, the locus of the centers of
the ∞¹ hyperosculating circles, obtained by varying the initial direction, is a
cubic having a rectangular node at the given point O. The nodal tangents
bisect⁸ the angles made by the directions w_1, w_2. When w_1 = w_2, the locus is
a conic which passes through the given point in the direction w.

The condition of third order contact demands that y''' of the differential
equation of circles, (1 + y'^2)\,y''' = 3y'y''^2, be the same as y''' of the system (2).
Equating the two expressions for y''' and then solving for y'', we find

⁷ Graustein, "Introduction to Higher Geometry" (1930), p. 155, ex. 9.

⁸ When w_1, w_2 are the two isotropic directions, the cubic degenerates to three
straight lines through the given point (if also G = 0, it degenerates into the whole
plane), and there is no rectangular node or angle bisection.


(9)    y'' = \frac{G(1 + y'^2)}{-H(1 + y'^2) + 3y'}.

Substituting for H from (4) and for G from (6), we have

(10)   y'' = -\frac{2(1 + y'^2)(h y'^3 + m y'^2 + n y' + k)}{3\,[(w_1 + w_2)y'^2 - 2(w_1 w_2 - 1)y' - (w_1 + w_2)]}.

This shows that to any x, y, y' there is one y''. Hence to any lineal element
there corresponds one curve which is hyperosculated by its circle of curvature.
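The step of equating the two expressions for y''' can be verified numerically. This is a minimal sketch added for illustration, assuming only the system y''' = G y'' + H y''^2 and the circle equation (1 + y'^2) y''' = 3 y' y''^2 quoted in the text; p stands for y', and the function names are hypothetical.

```python
from fractions import Fraction


def hyperosculating_y2(p, G, H):
    """y'' obtained by equating G y'' + H y''^2 with 3 p y''^2 / (1 + p^2)."""
    return G * (1 + p * p) / (3 * p - H * (1 + p * p))


def third_order_gap(p, G, H):
    """Difference between y''' of the system and y''' of the circle at that y''."""
    y2 = hyperosculating_y2(p, G, H)
    system_y3 = G * y2 + H * y2 * y2
    circle_y3 = 3 * p * y2 * y2 / (1 + p * p)
    return system_y3 - circle_y3
```

For any values of p, G, H with a nonzero denominator the gap is exactly zero, which is the statement that one and only one y'' gives third order contact.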
The coordinates of the center of curvature for a curve at a point, with
respect to that point as origin, and axes parallel to the coordinate axes, are

(11)   X = -\frac{y'(1 + y'^2)}{y''},    Y = \frac{1 + y'^2}{y''}.

Solving for y' and y'' in (11) and substituting the values in (10) we have

(12)   hX^3 - mX^2Y + nXY^2 - kY^3 - \tfrac{3}{2}(w_1 + w_2)X^2 - 3(w_1 w_2 - 1)XY + \tfrac{3}{2}(w_1 + w_2)Y^2 = 0.


This is a cubic with a node at the given point. We observe that when we set
the quadratic terms equal to zero, the product of the roots Y/X is -1 and
hence the tangents at the node are perpendicular to each other. It follows
from the identity

        (w_1 + w_2)(w_1 w_2) - (w_1 w_2 - 1)(w_1 + w_2) - (w_1 + w_2) = 0

that w_1, w_2 are harmonic conjugates with respect to the roots of the quadratic,
hence the roots, that is the nodal tangents, are the angle bisectors⁹
of w_1, w_2.
When w_1 = w_2, an extraneous factor X + wY must be removed from (12)
and we obtain a conic having the w direction at the given point.
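The locus just described lends itself to a direct check. The sketch below is an added illustration under stated assumptions: the hyperosculating y'' is taken as y'' = -2(1 + p^2)(h p^3 + m p^2 + n p + k) / (3[(w_1 + w_2)p^2 - 2(w_1 w_2 - 1)p - (w_1 + w_2)]) with p = y', the center of curvature as X = -p(1 + p^2)/y'', Y = (1 + p^2)/y'', and the cubic as hX^3 - mX^2Y + nXY^2 - kY^3 - (3/2)(w_1 + w_2)X^2 - 3(w_1 w_2 - 1)XY + (3/2)(w_1 + w_2)Y^2 = 0. The function names and sample values are hypothetical.

```python
from fractions import Fraction
import math


def center_of_hyperosculating_circle(p, h, m, n, k, w1, w2):
    """Center (X, Y) of the hyperosculating circle for the initial slope p."""
    S, P = w1 + w2, w1 * w2
    y2 = -2 * (1 + p * p) * (h * p**3 + m * p**2 + n * p + k) / (
        3 * (S * p * p - 2 * (P - 1) * p - S))
    return -p * (1 + p * p) / y2, (1 + p * p) / y2


def cubic(X, Y, h, m, n, k, w1, w2):
    """Left side of the cubic locus of centers."""
    S, P = w1 + w2, w1 * w2
    return (h * X**3 - m * X**2 * Y + n * X * Y**2 - k * Y**3
            - Fraction(3, 2) * S * X**2 - 3 * (P - 1) * X * Y
            + Fraction(3, 2) * S * Y**2)


def nodal_tangent_angles(w1, w2):
    """Inclinations of the roots u = Y/X of S u^2 - 2(P - 1) u - S = 0 (needs S != 0)."""
    S, P = w1 + w2, w1 * w2
    disc = math.sqrt((P - 1) ** 2 + S * S)
    return sorted(math.atan(((P - 1) + sign * disc) / S) for sign in (1, -1))
```

The first two functions use exact fractions, so a center computed for any admissible slope should satisfy the cubic exactly. The third uses floating point and can be compared with the mean of the inclinations of w_1 and w_2 to see the bisection and the right angle at the node.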
Conversely, we now ask for all systems (2) having Properties II and III.
The equation of a cubic having a rectangular node at the given point is

(13)   A_1 X^3 + 3A_2 X^2 Y + 3A_3 XY^2 + A_4 Y^3 + A_5 X^2 + A_6 XY - A_5 Y^2 = 0,

where the coefficients A are functions of x, y. The center of a hyperosculating
circle is given by (11) where y'' is defined by (9). If we substitute the center
in (13), and apply Property II to the result by substituting for H from (4)
and then solve for G, we find

(14)   G = \frac{-3(A_1 y'^3 - 3A_2 y'^2 + 3A_3 y' - A_4)\,[(w_1 + w_2)y'^2 - 2(w_1 w_2 - 1)y' - (w_1 + w_2)]}{2(y' - w_1)(y' - w_2)(A_5 y'^2 - A_6 y' - A_5)}.


⁹ Graustein, p. 155, ex. 10 and p. 152, Theorem 1. When the involution is circular,
w_1, w_2 are the isotropic directions and there is no question of bisection.




This expression for G becomes simplified when we make use of the fact that the
nodal tangents of the cubic bisect the angles formed by the lines Y = w_1 X,
Y = w_2 X. The nodal tangents are defined by setting the quadratic terms of
(13) equal to zero. As before we find that the condition for bisection is

        A_5 = \frac{(w_1 + w_2)A_6}{2(w_1 w_2 - 1)}.

Substituting in (14), and changing the notation somewhat we find that G
simplifies to the form (6) where h, m, n, k are arbitrary in x, y.

Similarly we deal with the case w_1 = w_2.

We have now proved that a necessary and sufficient condition that a system
of curves have an equation of the type (2) with H and G of the forms (4)
and (6) respectively, that is,

(15)   y''' = \frac{(h y'^3 + m y'^2 + n y' + k)\,y'' + 3\,[\,y' - \tfrac{1}{2}(w_1 + w_2)\,]\,y''^2}{(y' - w_1)(y' - w_2)},

where h, m, n, k, w_1 and w_2 are arbitrary functions of x, y (except for (8)
when w_1 = w_2), is that it possess Properties I, II, III.

Property V of Kasner's set for dynamical trajectories states a relation
between the radii of curvature of the trajectory in the w direction at a given
point which hyperosculates its circle of curvature, and of the line of force
(the line y' = w) passing through the point. This suggests a similar
investigation in the case of sectional families.

We shall prove the following property.


Property IV. Let the directions w_1 and w_2 of Property II be distinct.
Of the curves which pass through a given point in the direction w_1 there is
one which has contact of third order with its circle of curvature; the radius
of curvature of this curve is 3/2 the radius of curvature of that one of the
integral curves of the direction w_1 which passes through the point. A similar
statement holds for the direction w_2. When w_1 = w_2, the above statement is
replaced by the following: the curve in the w direction which has contact of
third order with its circle of curvature has the curvature zero.¹⁰

Let w_1 ≠ w_2. To find the radius of curvature of the curve in the w_1
direction which hyperosculates its circle of curvature we substitute in the
formula for radius of curvature y'' as determined by (10) and w_1 for y'.
We observe that this radius of curvature is, by definition of the cubic (12),


¹⁰ In this case the integral curve of the direction w also has curvature zero and
thus the 3/2 ratio still holds, but it is not necessary to mention this in order to have
a characteristic set of properties. On the other hand merely to say that the 3/2 ratio
holds is not sufficient for a characteristic set.




identical with the segment which the normal to the w_1 direction at O intercepts
on the cubic. Let us call the point of intersection of this normal with the
cubic N_1. Then

(16)   ON_1 = \frac{3(1 + w_1^2)^{3/2}(w_2 - w_1)}{2(h w_1^3 + m w_1^2 + n w_1 + k)}.

The radius of curvature ρ_1 of the curve y' = w_1 is

(17)   \rho_1 = \frac{(1 + w_1^2)^{3/2}}{w_{1x} + w_1 w_{1y}}.

The necessary and sufficient condition for ON_1 to be 3ρ_1/2 is

(18_1)  h w_1^3 + m w_1^2 + n w_1 + k = (w_2 - w_1)(w_{1x} + w_1 w_{1y}).

Similarly the condition corresponding to the w_2 direction is

(18_2)  h w_2^3 + m w_2^2 + n w_2 + k = (w_1 - w_2)(w_{2x} + w_2 w_{2y}).


When w_1 = w_2, the denominator of (10) has the simple factor y' - w.
In order for y'', and hence the curvature corresponding to the w direction, to
vanish when y' = w the cubic in the numerator of (10) must have (y' - w)^2
as a factor. In view of (8), the necessary and sufficient condition for this is

(19)   3h w^2 + 2m w + n = 0.

It may be verified that sectional families which have w_1 ≠ w_2 obey (18_1)
and (18_2) and those for which w_1 = w_2 obey (19). Hence sectional families
have Property IV.
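The verification that sectional families obey the two conditions of Property IV can be illustrated on a single surface. The sketch below assumes the surface z = xy^2, which is not taken from the paper and is chosen only because its asymptotic directions project to the rational slopes w_1 = 0 and w_2 = -2y/x. The conditions tested are h w_i^3 + m w_i^2 + n w_i + k = (w_j - w_i)(w_{ix} + w_i w_{iy}), with h, m, n, k formed from the third derivatives of f divided by f_{yy}; all partial derivatives are entered by hand.

```python
from fractions import Fraction


def property_iv_residuals(x, y):
    """Residuals of the two Property IV conditions for z = x*y**2 at (x, y), x != 0."""
    x, y = Fraction(x), Fraction(y)
    # f = x*y^2:  f_yy = 2x, f_xy = 2y, f_xx = 0,
    #             f_yyy = 0, f_xyy = 2, f_xxy = 0, f_xxx = 0.
    fyy = 2 * x
    h, m, n, k = 0 / fyy, 3 * 2 / fyy, 0 / fyy, 0 / fyy
    # Roots of f_yy w^2 + 2 f_xy w + f_xx = 0, with their partial derivatives.
    w1, w1x, w1y = Fraction(0), Fraction(0), Fraction(0)
    w2, w2x, w2y = -2 * y / x, 2 * y / x**2, -2 / x

    def numerator(w):
        return h * w**3 + m * w**2 + n * w + k

    r1 = numerator(w1) - (w2 - w1) * (w1x + w1 * w1y)
    r2 = numerator(w2) - (w1 - w2) * (w2x + w2 * w2y)
    return r1, r2
```

Both residuals are zero at every point with x ≠ 0. For the w_2 direction each side equals 12y^2/x^3, so the check is not vacuous.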

We obtain the remaining properties by differentiating and combining in
various ways the relations (5) and (7). We omit the calculations and merely
state the results, which can be verified from (5) and (7). The first relation
found is

(20)   \left[\frac{k - (w_1 w_2)_x}{w_1 w_2}\right]_y = h_x.

We shall prove that (20) is the necessary and sufficient condition for the
next property.


Property V. When the point O is moved, the associated cubic referred to
in III changes in the following manner. Take any two fixed perpendicular
directions for the x direction and the y direction; through O draw lines in
these directions meeting the cubic again at A and B respectively. Also
construct the normals to w_1 and w_2 at O. At A draw a line in the y direction
meeting these normals in some points A' and A'', and at B draw a line in the




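x direction meeting the normals in some points B' and B'' respectively. The
variation property referred to takes the form

(21)   \frac{\partial}{\partial x}\left(\frac{1}{AA'} + \frac{1}{AA''}\right) - \frac{\partial}{\partial y}\left(\frac{1}{BB'} + \frac{1}{BB''}\right) = \frac{2}{3}\,\frac{\partial^2}{\partial x\,\partial y}\log(w_1 w_2),

where AA', AA'', BB', BB'' are signed distances and where w_1, w_2 denote
the slopes of the directions referred to in II relative to the chosen x-direction.
This is true for any pair of orthogonal directions, and therefore really expresses
an intrinsic property of the system of curves.

To establish the above statement, we substitute in (20) the values of h
and k from

(22)   OA = \frac{3(w_1 + w_2)}{2h},    OB = \frac{3(w_1 + w_2)}{2k},

these latter being the intercepts on the coordinate axes of the cubic (12). On
simplifying the result somewhat, we find

        \frac{\partial}{\partial x}\,\frac{w_1 + w_2}{OA} - \frac{\partial}{\partial y}\,\frac{w_1 + w_2}{w_1 w_2\,OB} = -\frac{2}{3}\,\frac{\partial^2}{\partial x\,\partial y}\log(w_1 w_2).

Now if we carry out the construction as expressed in V we find from triangles
AA'O and BB'O that -OA = w_1 AA' and -BB' = w_1 OB, and from triangles
AA''O and BB''O that -OA = w_2 AA'' and -BB'' = w_2 OB. On
substituting these in the above equation, we obtain (21).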


The final property depends on the following two relations. 


Wi Ws I: 21, We 
+ We w, + we w+ (w, + w.)? 


(24)   (w_1 w_2)(w_1 + w_2)\,h + 2k = 2(w_1 w_2)_x - (w_1 w_2)(w_1 + w_2)_y.


If we now use the relations (22) in (23) and (24) we obtain 


(25) OA OB + % Ww, + we —— Ws + ws 


Property VI. Let the intercepts OA and OB be constructed as in V. 
The variation of the directions w, and w. as the point O is moved is related 
to these intercepts by the equations (25) and (26). 

Obviously (23) and (24) are the necessary and sufficient conditions for 


Property VI. 




We shall now prove that Properties I-VI are sufficient for sectional families.
We have already seen that any system of curves having Properties I, II
and III will have an equation of the form (15). Our problem is to show that
if we apply (18_1), (18_2) (or if w_1 = w_2, (8), (19)), (21), (25) and (26) to a
system of curves (15), the curves will be the projections of the plane sections
of a surface; that is, that w_1, w_2, h, m, n, k will be of the forms (5) and (7),
where f is some function of x and y. Instead of (21), (25) and (26) we may
use (20), (23), and (24), which are equivalent to them.

Equation (20) is an integrability condition; hence a function F(x, y)
exists which satisfies the equations

(27)   F_x = \frac{k - (w_1 w_2)_x}{w_1 w_2},    F_y = h.

If we place the expressions for h and k from (27) in (23) and multiply
the result by e^F we have

        (e^F w_1 w_2)_y = -\tfrac{1}{2}\,[\,e^F (w_1 + w_2)\,]_x.

Therefore a function H exists such that

(28)   e^F w_1 w_2 = H_x   and   -\tfrac{1}{2}\,e^F (w_1 + w_2) = H_y.

Now if we substitute (27) and (28) in (24) so as to eliminate h, k, w_1 and w_2,
and simplify, we find that H_{yy} = (e^F)_x. This equation means that a function
g exists for which H_y = g_x and e^F = g_y. The first of these two equations
further says that a function f exists for which H = f_x and g = f_y. We have
then

(29)   e^F = f_{yy},   e^F w_1 w_2 = f_{xx}   and   -\tfrac{1}{2}\,e^F (w_1 + w_2) = f_{xy}.

Consequently w_1, w_2 obey (5). Next we substitute (5) and (29) in the
equations (18_1), (18_2) (or if w_1 = w_2, in (8), (19)) and (27) and solve
them for h, m, n, k. We find that they satisfy (7).

We have now proved that every sectional family possesses Properties I-VI
and every family of curves possessing Properties I-VI is a sectional family.


Developable surfaces. On a developable surface the asymptotic directions
coincide, hence w_1 = w_2. Therefore a sectional family derived from a
developable surface, which we may call a developable system, has Properties I-VI
in the simpler form to which they reduce when w_1 = w_2.

Conversely, every family of curves having Properties I-VI in the reduced
form is a developable system. Thus the reduced set of six properties is a
characteristic set for developable systems. In this case G in (2) is linear in
y' and H is 3/(y' - w).




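When w_1 is set equal to w_2, (21) in Property V becomes

(30)   \frac{\partial}{\partial x}\,\frac{1}{AA'} - \frac{\partial}{\partial y}\,\frac{1}{BB'} = \frac{2}{3}\left(\frac{w_x}{w}\right)_y,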

and Property II says that y’ bisects the angle between w and the reflection in 
y’ of the tangent to the focal circle. For developable systems one asymptote 
of the conic in Property III is perpendicular to the w direction. 

Now a given type of family may be characterized by more than one set of 
properties. In the case of developable systems it is possible to replace VI by 
the following more elegant pair of properties. 


(A) The integral curves of the direction w are straight lines. 


(B) Let α be the angle between the asymptotes to the conic of III, let
K be the curvature of the conic at the point O, and let K_1 be the curvature at
O of the orthogonal trajectories to the straight lines y' = w. Then

        K \sin\alpha + 2K_1 \cos\alpha = 0.


Property (A) is well known although it has not been stated in this
connection before. The corresponding analytic statement is w_x + w w_y = 0. The
(w_x/w)_y in (30) then becomes -w_{yy}.
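A concrete developable surface makes the reduced conditions easy to inspect. The sketch below assumes the cone z = x^2/y, an example chosen here and not taken from the paper; its partial derivatives are entered by hand. It checks that the Hessian f_{xx}f_{yy} - f_{xy}^2 vanishes (so the two directions coincide in w = y/x), that w_x + w w_y = 0 as in Property (A), and that the double-root conditions h w^3 + m w^2 + n w + k = 0 and 3h w^2 + 2m w + n = 0 hold.

```python
from fractions import Fraction


def developable_checks(x, y):
    """Quantities that should all vanish for the cone z = x**2 / y (x, y nonzero)."""
    x, y = Fraction(x), Fraction(y)
    # Hand-computed partial derivatives of f = x^2 / y.
    fxx, fxy, fyy = 2 / y, -2 * x / y**2, 2 * x**2 / y**3
    fxxx, fxxy, fxyy, fyyy = Fraction(0), -2 / y**2, 4 * x / y**3, -6 * x**2 / y**4
    w = -fxy / fyy                      # the double direction, equal to y / x
    w_x, w_y = -y / x**2, 1 / x         # partial derivatives of w = y / x
    h, m, n, k = fyyy / fyy, 3 * fxyy / fyy, 3 * fxxy / fyy, fxxx / fyy
    return {
        "hessian": fxx * fyy - fxy**2,
        "property_A": w_x + w * w_y,
        "simple_root": h * w**3 + m * w**2 + n * w + k,
        "double_root": 3 * h * w**2 + 2 * m * w + n,
    }
```

Here the integral curves of y' = y/x are the straight lines through the origin, the projections of the generators of the cone.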

A property of sectional families derived from non-developable surfaces 
which is analogous to (A) may be obtained from (5) by differentiating, and 
eliminating f and its derivatives. The eliminant is found to be 


(Wy We) (Wiee + WW Weary Wi? Woy + We? Wiyy) 


This has the following geometric significance. Let 6 be the angle between 
the directions w, and w,, let y and T be the curvatures and let s and S be the 
lengths of arc along the curves y’ = w, and y’ = wy respectively; then 


d?6 ar dé dé 
— (COS — —— — 9 


dy dé ! a6 


This relation is a new characterization of asymptotic nets.” 


An alternative for Property III for developable systems is: 


¹¹ References to other characterizations of asymptotic nets may be found in Fubini
and Čech, "Introduction à la Géométrie Projective Différentielle des Surfaces" (1931),
p. 191.




Property III'. The locus of the centers of the ∞¹ circles corresponding
to the elements at a given point is a conic with that point as focus.
For a given developable surface the conic is 


(= + + = Vireo + fw (= xX + +%) ‘ 

The analogue of III' for families derived from non-developable surfaces
has not been worked out completely but it is certain that the locus of the
centers of the focal circles at a point is a cubic curve. The coefficients are
very long expressions in terms of f_{xx}, ..., f_{yyy} and they are symmetrical
in the subscripts x and y. The constant term turns out to be M(L^2 - 4M)^2
where M is the Hessian and L the Laplacian of the surface.


Ruled surfaces. Ruled surfaces may be characterized analytically by the 
fact that the cubic in y’ in the numerator and the quadratic in the denominator 
of G (that is, the coefficients of y” and y’” in (1)), have a linear factor in 
common. For the condition for the existence of such a factor is exactly the 
differential equation of ruled surfaces found by Monge. 

In order to give a characteristic set of geometrical properties for sectional
families derived from ruled surfaces we have only to add the following property
to the previous set.


Property VII. One of the two families of integral curves of the directions
w_1, w_2 consists of straight lines.

This is so because a ruled surface is a surface one of whose two families 
of asymptotic lines are straight lines. In this case the family of straight lines 
is at the same time part of the sectional family since it is the projection of the 
rulings on the surface. These straight lines belong to the hyperosculated 
curves referred to in Property IV. 

The condition for VII is

        (w_{1x} + w_1 w_{1y})(w_{2x} + w_2 w_{2y}) = 0,

where w_1, w_2 are given by (5). Incidentally it follows that this condition is
equivalent to Monge's equation for ruled surfaces.
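Both descriptions of ruled surfaces given above can be seen side by side on one example. The sketch assumes the ruled surface z = xy^2 (rulings y = const), an example introduced here for illustration only. For it w_1 = 0 and w_2 = -2y/x, the numerator of G is (3/x) y'^2 and the denominator is (y' - w_1)(y' - w_2), so the common linear factor is y' itself. The letter p stands for y', and the function name is hypothetical.

```python
from fractions import Fraction


def ruled_surface_checks(x, y, p):
    """Return (condition for VII, numerator of G, denominator of G) for z = x*y**2."""
    x, y, p = Fraction(x), Fraction(y), Fraction(p)
    # Projected asymptotic directions and their partial derivatives (hand-computed).
    w1, w1x, w1y = Fraction(0), Fraction(0), Fraction(0)
    w2, w2x, w2y = -2 * y / x, 2 * y / x**2, -2 / x
    condition_vii = (w1x + w1 * w1y) * (w2x + w2 * w2y)
    numerator = (3 / x) * p**2          # h = n = k = 0 and m = 3/x for this surface
    denominator = (p - w1) * (p - w2)
    return condition_vii, numerator, denominator
```

Evaluating at p = 0 makes numerator and denominator vanish together, which exhibits the shared factor, while the first entry is zero at every point with x ≠ 0.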


COLUMBIA UNIVERSITY. 


EXACTLY (k,1) TRANSFORMATIONS ON CONNECTED 
LINEAR GRAPHS.* 


By O. G. HARROLD, JR.¹


1. Introduction. In a paper by G. T. Whyburn [1]² there is given,
among other results, a detailed study of the behavior of interior transformations
on linear graphs. The results suggest a connection between these
transformations and transformations which are exactly (k,1).³ This paper gives
a more or less detailed account of exactly (k,1) continuous transformations
defined on a connected linear graph. Since we are considering a connected
graph, this type of transformation includes all local homeomorphisms defined
on the given set [2].
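A familiar instance may help fix the definition: a transformation is exactly (k,1) when every point of the image has exactly k inverse points. The sketch below is an illustration added here, not an example from the paper. It takes the circle as the interval [0, 1) with end-points identified and the k-fold wrapping t -> kt (mod 1), which is a local homeomorphism of a connected linear graph onto itself. The function names are hypothetical.

```python
from fractions import Fraction


def wrap(t, k):
    """The k-fold wrapping t -> k*t (mod 1) of the circle [0, 1) onto itself."""
    return (k * Fraction(t)) % 1


def inverses(y, k):
    """All points of [0, 1) that wrap(., k) carries to y."""
    return sorted({(Fraction(y) + j) / k for j in range(k)})
```

For every y in [0, 1) the list returned by `inverses(y, k)` has exactly k distinct entries, each of which `wrap` sends back to y.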

In 2, results are given concerning exactly (k,1) mappings defined on
Peano spaces of varying degrees of complexity. It is hoped that several of
these results will be of use in attacking the problem of determining precisely
what topological structure a set must have in order that an exactly (k,1)
continuous mapping can be defined on it for k > 1 [3].⁴ In 3, an example is
given showing that an exactly (3,1) image of a graph need not be a graph.
In 4, the case k = 2 is given special attention. It is shown that an exactly
(2,1) image of a graph A is a graph B, and furthermore, there exist
subdivisions of A and B into finite complexes K_A and K_B, respectively, such that
the transformation of K_A into K_B is simplicial. A formula is given which
not only relates the structure of A to B but also actually limits the type of
sets A on which an exactly (2,1) mapping can be defined.
2. Exactly (k,1) transformations in general. 


2.1. Let f(A) = B, where A and B are arcs. If f is at most (k,1) on A,
there is an open set U dense on A such that f is topological on each
component of U.


* Received March 7, 1940. 

¹ National Research Fellow.

² The numerals in brackets refer to the bibliography at the end of the paper.

³ All transformations considered in this paper are single valued and continuous.
By an exactly (k,1) transformation is meant a continuous mapping such that each
point of the image space has exactly k inverse points. By a (k,1) mapping is meant
a continuous mapping such that every point of the image set has at most k inverses.

⁴ See also an abstract by J. H. Roberts in the Bulletin of the American Mathematical
Society, 45-11-433.



Proof. It is supposed that A and B are non-degenerate arcs. The assertion
is true for k = 1. For k > 1 it will suffice to show that on any sub-arc
of A there is an open set U on which f is a homeomorphism. Denote the
end-points of A by a^1 and a^2 and those of B by b^1 and b^2. It may be supposed
that A is a sub-arc of the given arc A such that f^{-1}(b^1) = a^1 and f^{-1}(b^2) = a^2.
The inductive assertion is that the statement is true for k - 1. If every point
in the interior of B has at most k - 1 inverses in A, the desired open set U
exists by our hypothesis. If x ∈ B - (b^1 + b^2) has k inverses in A, let the first
and last of these in order from a^1 to a^2 be x^1 and x^2, respectively. There is
an open interval V in B with x as an end-point such that for every y ∈ V,
f^{-1}(y)·(a^1x^1 + x^2a^2) has at most k - 2 points, otherwise, since f(x^1x^2) is a
non-degenerate arc in B, some point in B would have at least k + 1 inverses.
For some z on a^1x^1 (or x^2a^2) the open interval W between z and x^1 (or x^2)
maps into a subset of V. By the choice of V, f must be (k - 1, 1) on W,
hence by the inductive assumption, there is an open set U dense in W such
that f is topological on each component of U. This proves 2.1.


2.2. If f is exactly (k,1) on the stably regular curve A,⁵ B = f(A) is
stably regular.


Proof. First, B can contain no non-degenerate continuum X which contains
no arc. For this would imply that f^{-1}(X) is totally disconnected, which
is not possible for an exactly (k,1) mapping.⁶ Thus, assuming that B can
contain a non-degenerate continuum X such that X ⊂ \overline{B - X}, we may take
X to be an arc. It will be shown that this denial that B is stably regular
leads to a contradiction. Since f^{-1}(X) is not totally disconnected, there is a
non-degenerate continuum Y ⊂ f^{-1}(X). Since A is stably regular, we may
take Y to be a free arc⁷ in A. Then f(Y) = X^1 ⊂ X. Hence we may apply
2.1 and further restrict Y to be an arc mapping topologically into some arc
X^1 in X. Clearly, X^1 ⊂ \overline{B - X^1}. Each point x in the interior of X^1 has
exactly k - 1 inverses in A - Y. Thus any arc X_1^1 wholly in the interior of
X^1 has an inverse f^{-1}(X_1^1)·(A - Y) which is not totally disconnected, since
f is exactly (k - 1, 1) on the compact set f^{-1}(X_1^1)·(A - Y). Hence there
is a free arc Z ⊂ A - Y such that f maps Z topologically into X^2 ⊂ X_1^1. As
before, X^2 ⊂ \overline{B - X^2}. After a finite number of steps there are determined
k free arcs Y, Z, ..., W in A such that each contains an arc T_Y, T_Z, ..., T_W


⁵ A continuum M is said to be stably regular (beständig regulär) provided that
for no non-degenerate continuum T does T ⊂ \overline{M - T}.

⁶ See [3] and the references given therein.

⁷ The arc T is said to be free in X provided no interior point of T is a limit point
of X - T. It is essential to notice that a free arc is a closed point set.




mapping topologically onto X^0, where X^0 is a non-degenerate arc such that
X^0 ⊂ \overline{B - X^0}. Let x be any interior point of X^0. Let x_n → x, x_n ∈ B - X^0.
Since f^{-1}(x_n) and (Y + Z + ... + W) have no common points, this implies
that the compact set A - (T_Y + T_Z + ... + T_W) has an inverse to x, hence
x has k + 1 inverses in all, which is not possible. This permits us to state also


2.3. If f is exactly (k,1) on the stably regular curve A, to each
non-degenerate arc G in B = f(A) there is a non-degenerate sub-arc G^1 of G such
that f^{-1}(G^1) consists of k arcs each mapping topologically onto G^1.


2.4. If f is exactly (k,1) on the Peano space A and x is an end-point of
B = f(A), each point of f^{-1}(x) is an end-point of A.

Proof. Denote the Urysohn-Menger order of x by o(x). Set f^{-1}(x)
= x^1 + x^2 + ... + x^k. Let (T_j^i), j = 1, 2, ..., n_i, be a set of arcs in A
each terminating at x^i, but with no other common point (by pairs). Suppose
each set chosen so that T_j^i · T_m^l = 0 for i ≠ l. Then

        C = \prod_{i=1}^{k} \prod_{j=1}^{n_i} f(T_j^i)

contains an infinite sequence of distinct points converging to x since o(x) = 1.
Clearly, each point of C - x has \sum_{i=1}^{k} n_i inverses, hence each n_i = 1 and
each o(x^i) = 1.
If V_ε(x) denotes a region in B of diameter less than ε, and if V_ε(x) - x
has o(x) components for ε sufficiently small, then x is an end-point of each
such component. We have


2.5. If f is exactly (k,1) on the Peano space A and xe B=f(A) is such 
that each sufficiently small region in B containing x is cut into o(x) com- 
ponents by the removal of x, then 0(a‘) S o(2) 
i=l 
2.6. Let f be exactly (k,1) on the stably regular curve A. If X is a free arc
in B = f(A) containing no point of E = f(E^0), where E^0 is the set of branch
points plus end-points of A, then f^{-1}(X) has at most 2k - 1 components.


Proof. Let X be a free arc in B - E. Then f^{-1}(X) ⊂ A - E^0. Since
f is exactly (k,1) on A, f^{-1}(X) is not totally disconnected. Let J be a
non-degenerate component of f^{-1}(X). By the choice of E, J is necessarily an arc
(a free arc).⁸ By the continuity of f, each end-point of J must map into an
end-point of X (not necessarily the same end-point). There remain at most


⁸ It is evident that J could be only a simple closed curve or arc, since all of its
points are of order two. The first possibility is ruled out by 2.7 (which does not
depend on 2.6).


2k - 2 points in f^{-1}(X) - J to be located which map into an end-point of X.
Clearly, any isolated point of f^{-1}(X) must map, by continuity, into an
end-point of X. Hence each component of f^{-1}(X) - J contains at least one point
which is carried into an end-point. Thus f^{-1}(X) - J can have at most 2k - 2
components.


2.7. If f is exactly (k,1), k > 1, on the continuum A, B = f(A) is not
an arc.


Proof. Suppose, on the contrary, that B is an arc for some exactly (k,1),
k > 1, mapping defined on a continuum A. Now it is known that if a regular
curve (Menger) is obtained from a continuum A by an at most (k,1)
continuous mapping, then A is likewise regular [4]. Thus A will be assumed to
be regular (hereditary local connectedness is sufficient). We take B to be the
unit interval 0 ≤ y ≤ 1. First, there is a proper subcontinuum A^1 of A such
that f is exactly (k,1) on A^1 and f(A^1) = B^1, where B^1 is the interval
0 ≤ y ≤ b^1 < 1. By 2.4, each point of f^{-1}(1) = x^1 + x^2 + ... + x^k is an
end-point of A. Let T^1, T^2, ..., T^k be k arcs in A containing x^1, ..., x^k,
respectively, and such that T^i T^j = 0, i ≠ j. The set C = \prod_{i=1}^{k} f(T^i) is an arc
in B containing y = 1. Setting S^i = T^i f^{-1}(C) and noting that f is (1,1)
on S^i, S^i is an arc. In fact, since B is an arc, the arcs S^i are free in A. Let
the end-points of C be z and y = 1. For any z < b^1 < 1, the set f^{-1}(0 ≤ y ≤ b^1)
is A minus k open free arcs (each containing an end-point of A) which is a
continuum A^1 on which f is exactly (k,1).
Next, the property of being a continuum in A on which f is exactly (k,1)
is an inducible property, for if A^0 = \prod_{i=1}^{\infty} A^i, where each A^i is a continuum in
A on which f is exactly (k,1) and A^{i+1} ⊂ A^i, then x ∈ A^0 implies f^{-1}f(x) ⊂ A^0.
It follows that there is a continuum A^0 in A which is irreducible with respect
to the property of being a continuum containing f^{-1}(0) and on which f is
exactly (k,1). If f(A^0) is non-degenerate, we get a contradiction, for A^0 is
a Peano space (A is hereditarily locally connected) and the first remark will
apply, hence A^0 is not irreducible. If f(A^0) = 0, we again get a contradiction,
for A^0 is a continuum containing only k points.

2.8. If f is exactly (k,1) on the compact, hereditarily locally connected
space A and B = f(A) is an arc, there is an arc B^1 in B such that f^{-1}(B^1)
consists of k arcs each mapping topologically onto B^1.


Proof. The statement is trivial if B reduces to a single point. Since
f is exactly (k,1), there exists a non-degenerate continuum A^1 ⊂ A. Since
A is hereditarily locally connected, A^1 may be taken to be an arc. Hence,
by 2.1, there is an arc A_1^1 mapped topologically by f into B_1^1 ⊂ B. Since
f is exactly (k - 1, 1) on the half-compact space A - A_1^1, there is a
non-degenerate continuum A^2 ⊂ f^{-1}(B_1^1)·(A - A_1^1). As before, we take A^2 to
be an arc, and applying 2.1, there is an arc A_1^2 ⊂ A^2 on which f is topological.
Set f(A_1^2) = B_1^2. Then A_1^1 and A_1^2 each contain an arc mapping
topologically onto B_1^2. After k such steps, we obtain arcs A_1^1, A_1^2, ..., A_1^k each
of which contains an arc C^i, i = 1, 2, ..., k, mapping topologically
onto B_1^k.


2.9. If f is exactly (k,1) on the continuum A and B = f(A) is stably
regular, so also is A.


Proof. Case 1. The continuum A is hereditarily locally connected.
Suppose T is a continuum of condensation of A. Since A is hereditarily locally
connected, we may take T to be an arc. Let S = f(T). Let X be a free arc
in S. Since X is free, f^{-1}(X) can have only a finite number of components
and is thus hereditarily locally connected. Applying 2.8 to f^{-1}(X), there
exists a set of k arcs T^1, T^2, ..., T^k (mutually separated by pairs) in A such
that each T^i maps homeomorphically onto X^1 ⊂ X. Since the sum of k
mutually separated arcs cannot contain a continuum of condensation, there are
points in any neighborhood of a point a ∈ T which belong to A - (T + \sum_{i=1}^{k} T^i).
Let z be any interior point of the arc X^1. Since f(T) ⊃ X, f^{-1}(z)·T ≠ 0. Let
x ∈ T·f^{-1}(z). Let t_n → x, t_n ∈ A - (T + \sum_{i=1}^{k} T^i). Then by continuity,
f(t_n) → f(x). But f(x) is an interior point of a free arc X all of whose
inverses have been located in \sum T^i, which is a contradiction.


Case 2. The continuum A contains a continuum of convergence. The
transformation f being at most (k,1) and A irregular (in the sense of
Menger), B is likewise [4].


3. The original set A is a graph. If A is a connected linear graph,
the image set B = f(A) under an exactly (k,1) mapping is a stably regular
curve which has at most a finite number of end-points. The following example
shows that an exactly (3,1) continuous mapping defined on a graph need
not give a graph.


EXAMPLE. Two basic mappings of an arc into an arc will be defined.
Let C be the interval 0 ≤ t ≤ 1. Let D be the interval 0 ≤ y ≤ 1. Let the
closed interval of C (D) between (n + 1)^{-1} and n^{-1} be denoted by C_n (D_n).
Define f as follows: Map C_1 topologically into D_1 with f(1) = 1. Map C_2



(topologically) into D_1 with f(1/2) = 1/2. Map C_3 into D_1 + D_2 with
f(1/3) = 1. In general, C_{2n} is mapped onto D_n such that as t decreases
y increases and C_{2n-1} is mapped onto D_{n-1} + D_n such that as t decreases
y decreases. Finally, f(0) = 0. Each point in the interior of D has three
inverses in C, while y = 0 has one and y = 1 has two inverses. This will be
called a mapping of type (α). By demanding that the point which generates
D oscillate in the same manner near y = 1 as it does near y = 0, a continuous
mapping of C on D is effected which has the same properties as the one above
except that both y = 1 and y = 0 now have precisely one inverse. This will
be referred to as a mapping of type (β).
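The inverse counts claimed for a mapping of type (α) can be confirmed on one concrete realization. The sketch below makes two assumptions that go beyond the text: the map is taken to be linear on each C_j = [1/(j+1), 1/j] (the paper only requires it to be topological there), and the break-point values are read off the description as f(1) = 1, f(1/(2n)) = 1/(n+1), f(1/(2n+1)) = 1/n for n ≥ 1. Only finitely many pieces are scanned, so the count is reliable for values of y well above 1/pieces; the point t = 0, the single inverse of y = 0, is not scanned.

```python
from fractions import Fraction


def breakpoint_value(j):
    """Value of f at t = 1/j under the reading stated in the lead-in."""
    if j == 1:
        return Fraction(1)
    n = j // 2
    return Fraction(1, n + 1) if j % 2 == 0 else Fraction(1, n)


def inverses(y, pieces=200):
    """Points t in [1/(pieces + 1), 1] with f(t) = y, for f linear on each C_j."""
    y = Fraction(y)
    found = set()
    for j in range(1, pieces + 1):
        t_hi, t_lo = Fraction(1, j), Fraction(1, j + 1)
        v_hi, v_lo = breakpoint_value(j), breakpoint_value(j + 1)
        if min(v_lo, v_hi) <= y <= max(v_lo, v_hi):
            # f is monotone on C_j, so there is exactly one solution in this piece;
            # the set removes duplicates at break-points shared by two pieces.
            found.add(t_lo + (y - v_lo) * (t_hi - t_lo) / (v_hi - v_lo))
    return found
```

Interior values such as y = 3/4 or y = 2/5 are found to have three inverses, while y = 1 has two, in agreement with the description of type (α).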

Let X^1, X^2 and X^3 be the intervals 0 ≤ x ≤ 1, y = 1, 2, 3, respectively.
Let J denote x = 0, 1 ≤ y ≤ 3. Set A = J + X^1 + X^2 + X^3. Let X_n^i be
the sub-interval of X^i between x = 1/n and x = 1/(n + 1). For a fixed n
define on X_n^i, i = 1, 2, 3, a mapping of type (β). Then identify the
end-points of the image arcs corresponding to x = 1/n and identify the end-points
of the image arcs corresponding to x = 1/(n + 1). Thus f(X_n^1 + X_n^2 + X_n^3)
= Y_n is a theta curve (i.e. of the form of the letter θ). Every point of Y_n
has exactly three inverses on X_n^1 + X_n^2 + X_n^3. Repeating this for each
n = 1, 2, 3, ... and setting Y = \sum Y_n, we obtain a continuous exactly (3,1)
mapping of X^1 + X^2 + X^3 onto Y. Evidently Y is merely the enclosure of a
sequence of theta curves converging down to a single point such that Y_n · Y_{n+1}
is a single point. Now on J define a mapping similar to type (α) such that
the points of J·X^1, J·X^2 and J·X^3 are identified. Thus f(J) = J^1 is a
topological circle. Setting B = f(A), we have an exactly (3,1) continuous
mapping of the triod A onto B. The image B is clearly no graph, since it
contains an infinite sequence of simple closed curves.

A continuous transformation f is said to be locally interior at the point x
provided that f(x) is not a boundary point of the transform of any open set U
containing x. The above mappings of type (α) and (β) fail to have the
property of being locally interior at the points t = 1/n, hence the above
exactly (3,1) mapping is not locally interior at infinitely many points. It
follows from 2.3 that any exactly (k,1) mapping defined on a stably regular
continuum is locally interior except perhaps for a closed set of dimension zero.


3.1. If f is an exactly (k,1), k >1, mapping defined on the graph A, 
B=f(A) contains a simple closed curve. 


Proof. It is to be shown that B is not a dendrite. From 2.4 and the 
fact that A is a graph it follows that B has at most a finite number of end- 
points. Let B have n end-points. The assertion has been proved for n = 2
(2.7). Assuming the statement true for a dendritic graph with n end-points,
it will be shown to be true for n + 1. Let x be an end-point of B. Denote
the maximal free arc containing x by X. Set $C = B - X$. Set $D = f^{-1}(C)$.
The property of being a continuum in A which contains D and on which f is
exactly (k,1) is inducible. Hence there is a continuum $A^0$ in A which is
irreducible with respect to this property. If $f(A^0)$ contains C as a proper
subset, by precisely the same reduction as was made in the proof of 2.7 we
can find a subcontinuum H in $A^0$ such that $C \subset f(H)$, $f(H)$ is a proper subset
of $f(A^0)$ and f is exactly (k,1) on H. But this denies the irreducibility
property of $A^0$. Hence $f(A^0) = C$. But C has one less end-point than B,
hence, by the inductive hypothesis, f is not exactly (k,1) on $A^0$.

The preceding results, in so far as they apply to the case in which A is a
graph, may be summarized as follows.

3.2. Let f be a continuous exactly (k,1) transformation defined on the
connected linear graph A. The image B = f(A) is a stably regular curve.
The curve B is never a dendrite, and for k > 2 need not be a graph. The
function f is locally interior at all points of A except possibly for a closed set
of dimension 0. There is a closed set D of dimension 0 in B such that each
component of B - D is an open free arc whose inverse is precisely k open
free arcs in A each mapping topologically onto the common image. Each
free arc X in B containing no point of $E = f(E^0)$, where $E^0$ is the set of
end-points plus branch points of A, is such that $f^{-1}(X)$ has at most 2k - 1
components.

While the above theorem fails to give an exact statement of what the 
image B will be in terms of the properties of A, it does show that an exactly 
(k,1) mapping on a graph has some of the characteristics of an interior 
mapping. For instance, it produces only a slightly more complicated curve 
than the original set. This is meant only in a relative sense, of course. It is 
known that an at most (3,1) mapping on an arc can increase dimensionality, 
while a (2,1) mapping on an arc can produce a curve containing a continuum 
of convergence. (For interior transformations the property of being a graph 
is preserved). 

4. Exactly (2,1) transformations on a connected linear graph. The
results in this case are much more precise, as would be expected.
The underlying reason for this seems to be that 2.1 can be
strengthened to read

4.1. If f maps the arc A into the arc B and is at most (2,1) on A, then f is
topological on A provided it preserves end-points.$^{9}$


9 See [3], Lemma A.




As intermediate conclusions to the main results of this section we show 


4.2. If f is exactly (2,1) on the graph A, then (i) there exists at most a
finite number of points x in B = f(A) such that x is the vertex of a triod
in B containing free arcs xy and xz; (ii) all but a finite number of the maximal
free arcs X in B are such that $f^{-1}(X)$ has exactly two inverse components;
(iii) no point x in B is the vertex of infinitely many free arcs.


Proof. (i) Since the set $E^0$ of end-points and branch points in A is
finite, $f(E^0)$ is a finite set. Hence if there were infinitely many points x in B
with the asserted property, there would be one, say x, such that $x \notin f(E^0)$. Let
T be the enclosure of a region in B containing a triod with x as vertex and
such that two of the arcs of this triod, say xy and xz, are free arcs. It is
supposed further that $T \cdot f(E^0) = 0$. Set $f^{-1}(x) = x^1 + x^2$. The points $x^1$ and
$x^2$ are in the interior of free arcs in A. Let $X^{ij}$, $i, j = 1, 2$, be free arcs in A
having only the point $x^i$, $i = 1, 2$, in common and such that $f(X^{ij}) \subset T$. It
may be supposed that $(X^{11} + X^{12}) \cdot (X^{21} + X^{22}) = 0$. Denote three components
of $T - x$ by xy, xz and W. Suppose $f^{-1}(xy) \cdot X^{11} \neq 0$. Let $p \in f^{-1}(xy) \cdot X^{11}$.
Then $p \neq x^1$. Denote the subarc of $X^{11}$ from p to $x^1$ by $px^1$. Let the last
point on $px^1$ in $f^{-1}f(p)$ be $p^1$. Then f is topological on $p^1x^1$ by 4.1. Since
f is exactly (2,1), $f^{-1}(xy) \cdot X^{21}$ (say) $\neq 0$. Similarly, there is an arc $p^2x^2$
in $X^{21}$ on which f is topological. Hence there must be four arcs $p^1x^1$, $p^2x^2$,
$q^1x^1$ and $q^2x^2$ on which f is topological and such that
$f(p^1x^1 + p^2x^2 + q^1x^1 + q^2x^2) \subset xy + xz$. Further, the sum of these four arcs contains an open
set containing $x^1 + x^2$. The continuity of f, however, implies that x have at
least one more inverse, since W has x as a limit point. This denies the (2,1)
property. The property (iii) follows from an analogous argument.

(ii). The maximal free arcs are uniquely determined. Suppose there are
infinitely many of them. There is one, call it K, such that each point of
$f^{-1}(K)$ is of Urysohn-Menger order two and $f^{-1}(K)$ contains no topological circle of A.
Since f is exactly (2,1), we have by 2.6 that $f^{-1}(K)$ has at most 3 components.
Since f is exactly (2,1), at least one of these components must be non-degenerate.
By the choice of K it must be an arc. Let T be such a non-degenerate
component of $f^{-1}(K)$. Since K is a free arc, end-points of T must
map into end-points of K. Case 1. If the end-points of K are contained in
the image set of the end-points of T, then by 4.1, f must be topological on T.
Since f is (1,1) on the set $f^{-1}(K) - T$, which has at most 2 components,
$f^{-1}(K) - T$ consists of a single component T' which maps topologically onto
K. Case 2. If one end-point of K contains the image set of the end-points
of T, we distinguish two cases a) and b), according as f(T) contains the
other end-point of K or not. In the first mentioned possibility it is easy to




show that T is the sum of two arcs $T^1$ and $T^2$ having only a common end-point
and such that each $f(T^i) = K$. Hence all points of K have two inverses on T
except one end-point. Thus $f^{-1}(K) - T$ consists of a single point, and
$f^{-1}(K)$ has two components. In the second mentioned possibility, $f(T) = K^1$
is a subarc of K. In this case T contains two inverses to all points of $K^1$
except the end-point of $K^1$ in the interior of K. Set $K^2 = K - K^1$. Since
$K^2$ is free, $f^{-1}(K^2) = T^0$ can only have a single non-degenerate component,
which is seen to be $T^0$ itself. Each point in the interior of $K^2$ has two inverses
in $T^0$. The point $K^1 \cdot K^2$ has one inverse in T and one in $T^0$. Thus any arc
K in B of the type we have described has exactly two inverse components,
i.e. $f^{-1}(K)$ has two components.


4.3. Let f be exactly (2,1) on the connected linear graph A. Then B = f(A)
is a graph. There exist subdivisions of A and B into finite complexes $K_A$ and
$K_B$ such that the transformation of $K_A$ into $K_B$ is simplicial.


Proof. First, B is a graph. To this end an upper semi-continuous
decomposition of B is effected. From 2.2, there are free arcs in every open
set in B. Any free arc is contained in either a maximal free arc or in a simple
closed curve having just one point in common with the rest of B. (If B is
a simple closed curve, our conclusion is already attained.) To each free arc T
in B there is a connected set Γ containing T which is a sum of such simple
closed curves (of the type just mentioned) and maximal free arcs and which
is maximal in this regard. By 4.2 (i), (iii), Γ contains only a finite number
of simple closed curves or maximal free arcs. Hence Γ is a graph. Now the
elements of the decomposition are to be (1) the graphs Γ which contain a
simple closed curve or two maximal free arcs, (2) the maximal free arcs in B
(not already in (1)) which have more than two inverse components, and
(3) the points in B not in an element of type (1) or (2). It will be shown
that there is only one element in this decomposition. If there is only one
element, clearly it must be of type (1). Next, the above definitions do give
an upper semi-continuous decomposition. First, the elements are disjoint.
From 4.2 (i), the elements of type (1) and hence all are closed. Also, by
4.2, there are only a finite number of elements of type (1) or (2) so clearly
this gives an upper semi-continuous decomposition. Let C be an image space
of a corresponding continuous transformation g defined on B, g(B) = C.
Evidently, C can have no continuum of condensation, thus C is stably regular.
If C contains more than a single point, the inverse of a point in C with a
finite number of exceptions is a single point. Denote this exceptional set by
$E \subset C$. By the manner of selection of the elements of the decomposition,




no two maximal free arcs in C can intersect. Also, C can contain no simple
closed curve having only one point in common with the rest of C. The curve
C has only a finite number of end-points since B has only a finite number.
(The function g is topological on $B - g^{-1}(E)$.) Hence we are in a position
to define another upper semi-continuous decomposition, this one taking place
on C. The elements of this decomposition are defined to be the maximal free
arcs in C and the points in no free arc. It is known that this decomposition
defined on such a curve C gives a hyperspace D containing no free arc [3].
Setting $h(C) = D$, $D = hgf(A)$. The continuous transformation of A into
D can be factored into a monotone transformation $f_1$ followed by a light
transformation $f_2$, where $f_1(A) = A^1$ and $f_2(A^1) = D$. This factorization
may be so accomplished that the ‘points’ in $A^1$ are the components of inverse
sets to the mapping $hgf(A) = D$ [5]. Since $f_1$ is monotone and A is a graph,
$A^1$ is a graph.$^{10}$ The set $h(E)$ contains only a finite number of points. Let
x be any point of $D - h(E)$. It is readily seen that either $f^{-1}g^{-1}h^{-1}(x)$ consists
of exactly two arcs (one of which may be degenerate) or two points in A
(according as $h^{-1}(x)$ is an arc or a point), hence $f_2^{-1}(x)$ consists of exactly
two points. Thus the transformation $f_2$ carrying the graph $A^1$ into D is exactly
(2,1) except for the points of $h(E)$. Since D contains no free arc, there is
a non-degenerate arc T in $D - h(E)$ such that $T \subset \overline{D - T}$. Now precisely
as in the proof of 2.2 (taking $A = A^1$), this leads to a contradiction. Hence
C is a single point, i.e. B is a graph.

It will now be shown that there exist subdivisions of A and B into finite
complexes $K_A$ and $K_B$, respectively, such that the transformation of $K_A$ into $K_B$
is simplicial. Let E be a finite set in B such that it contains all points of B
of order $\neq 2$ and such that each component of $B - E$ has an enclosure which
is an arc uniquely determined by its end-points. Set $E^0 = f^{-1}(E)$. Add to
$E^0$ a finite set $F = f^{-1}f(F)$ such that each component of $A - (E^0 + F)$ is
an open free arc (whose enclosure is an arc) uniquely determined by its end-points.
Consider any component U of $B - (E + f(F))$. Since each non-degenerate
component of $f^{-1}(U)$ contains only points of order $\le 2$ and contains
no simple closed curve, the reasoning in the proof of 4.2 (ii) can be applied,
hence $f^{-1}(U)$ has two components. If U = K gives rise to Case 1, $f^{-1}(U)$
consists of two disjoint arcs which are mapped topologically onto U. By
definition of E and F, these arcs are edges of the complex introduced into A
by the points $F + E^0$. If U gives Case 2a, $f^{-1}(U)$ has a single non-degenerate
inverse component which is the sum of two arcs, each mapping topologically


10 See an abstract by W. T. Puckett and G. Watson, Bulletin of the American 
Mathematical Society, 43-3-182. 




onto U. Again these are already edges of the complex on A. If U gives
Case 2b, the arc U is subdivided by the insertion of a vertex at the point
$K^1 \cdot K^2$. Denoting the augmented set of vertices in B by G, each component
of $B - G$ is such that its inverse under f in A is precisely two open arcs each
mapping topologically onto the component in $B - G$. Denote the complex
induced on B by G by $K_B$ and the complex induced on A by $f^{-1}(G)$ by $K_A$,
then the transformation f carries edges of $K_A$ into edges of $K_B$ and in topological
fashion.


4.4. Let f be exactly (2,1) on the connected linear graph A. For each
point $x \in B = f(A)$ the relation $o(x) = \tfrac{1}{2}[o(x^1) + o(x^2)]$ holds, where
$f^{-1}(x) = x^1 + x^2$ and $o(p)$ denotes the order of the point p.


Proof. Suppose A and B have been subdivided into the finite complexes
$K_A$ and $K_B$ of the last paragraph. Since f is exactly (2,1), each simplex in
$K_B$ is the topological image of two and only two of the simplexes of $K_A$. The
asserted relation is a direct result of enumeration.

This formula shows that any exactly (2,1) mapping defined on an arc
(or circle) can contain no point of order three, hence, after showing that the
arc and circle cannot be exactly (2,1) images of an arc, we have another
proof that it is impossible to define an exactly (2,1) continuous transformation
on an arc [3]. This relation also shows that any exactly (2,1) image
of a simple closed curve is a simple closed curve. By fitting together two
simple arcs to form a simple closed curve and defining mappings similar to
those of type (α) and (β), it is easy to show that an exactly (k,1) mapping
on a simple closed curve need not give a simple closed curve for k > 2. Since
this relation also implies that an exactly (2,1) transformation cannot be
defined on any dendritic graph,$^{11}$ it is of interest to give all of the possibilities
(topologically) when A is a simple closed curve.
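
The enumeration behind this relation and behind the Euler characteristic remark of footnote 11 can be illustrated by a small Python sketch. It is not part of the original argument: the subdivisions, the vertex and edge labels, and the helper names `euler_characteristic` and `fibers_have_size` are assumptions chosen for illustration. A simple closed curve is subdivided into four edges and mapped simplicially onto a curve with two edges, in the way w = z² would map it; every simplex of the image then has exactly two inverse simplexes, and 2χ(B) = χ(A). A triod has χ = 1, which is odd, so no such count is possible for it.

```python
from collections import Counter


def euler_characteristic(vertices, edges):
    """Euler characteristic of a finite 1-dimensional complex: V - E."""
    return len(vertices) - len(edges)


def fibers_have_size(mapping, size):
    """True if every value of `mapping` is taken exactly `size` times."""
    return all(count == size for count in Counter(mapping.values()).values())


# Hypothetical subdivision K_A of a simple closed curve: a 4-cycle.
circle_a_vertices = ["a0", "a1", "a2", "a3"]
circle_a_edges = ["a0a1", "a1a2", "a2a3", "a3a0"]

# Hypothetical subdivision K_B of the image: two vertices joined by two edges.
circle_b_vertices = ["b0", "b1"]
circle_b_edges = ["upper", "lower"]

# Simplicial map modelled on w = z^2: opposite simplexes share an image.
vertex_map = {"a0": "b0", "a1": "b1", "a2": "b0", "a3": "b1"}
edge_map = {"a0a1": "upper", "a1a2": "lower", "a2a3": "upper", "a3a0": "lower"}

chi_a = euler_characteristic(circle_a_vertices, circle_a_edges)
chi_b = euler_characteristic(circle_b_vertices, circle_b_edges)

# A triod (a dendritic graph): one branch point, three end-points, three edges.
triod_vertices = ["c", "e1", "e2", "e3"]
triod_edges = ["ce1", "ce2", "ce3"]
chi_triod = euler_characteristic(triod_vertices, triod_edges)

if __name__ == "__main__":
    print(fibers_have_size(vertex_map, 2), fibers_have_size(edge_map, 2))
    print(chi_a, chi_b, chi_triod)
```

The sketch only counts simplexes; it does not construct or test continuity of any map.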


4.5. Let f be exactly (2,1) on the simple closed curve A. Then B = f(A)
is a simple closed curve and f is topologically equivalent to either (a) $w = z^2$
on $|z| = 1$, or (b) $w = z^2$ on $|z| = 1$ for $\Im(z) \ge 0$, and $w = \bar{z}^2$ on $|z| = 1$
for $\Im(z) \le 0$.


Proof. The transformation f(A) = B is said to be topologically equivalent
to $g(A^1) = B^1$ provided there exists a pair of homeomorphisms h and $h^1$
such that $h(A) = A^1$ and $h^1(B^1) = B$ and $f = h^1 g h$ (or $g = (h^1)^{-1} f h^{-1}$) [6].


11 A. D. Wallace pointed out that this implies $2\chi(B) = \chi(A)$, where $\chi$ is the Euler
characteristic. Hence if A is a dendritic graph, no exactly (2,1) transformation can
be defined on A. This result has been announced elsewhere. See P. W. Gilbert, abstract
45-11-420, Bulletin of the American Mathematical Society.




Two cases are distinguished according as f is interior or not. If f is interior
on the simple closed curve A and is (2,1), and, if it is known further that
the image is a simple closed curve, then f is topologically equivalent to $w = z^2$
on $|z| = 1$ [1]. If f is not interior on A, there is a point $x^1 \in A$ and an open
set $U \supset x^1$ such that $x = f(x^1)$ is a boundary point of $f(U)$. Let $x^2$ be the
other inverse of x. Let h be a homeomorphism carrying A into the unit circle
$A^1$ in the complex z plane such that $h(x^1) = +1$ and $h(x^2) = -1$. Let
$h^1$ be a homeomorphism carrying the unit circle $B^1$ of the complex w plane
into B such that $h^1(1) = x$. Since f is not interior at $x^1$ (or $x^2$), it follows
that each of the arcs of A determined by $x^1$ and $x^2$ map onto the whole of B
under f. Denote by C and D the semi-circles of $A^1$, $\Im(z) \ge 0$ and $\Im(z) \le 0$,
respectively. Set $g(z) = (h^1)^{-1} f h^{-1}(z)$. Suppose as C is described from right
to left that $B^1$ is described by $g(z)$ in counterclockwise fashion. Then on
$h^{-1}(C)$ the function f is topologically equivalent to $w = z^2$ on $|z| = 1$,
$\Im(z) \ge 0$. Since $x = f(x^1)$ is a boundary point of the transform of some
open set containing $x^1$ in A, $B^1$ must be described in opposite fashion as D is
described from left to right. Hence on $h^{-1}(D)$ the function f is topologically
equivalent to $w = \bar{z}^2$ on $|z| = 1$, $\Im(z) \le 0$.
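
The two model maps of 4.5 can be checked to be exactly (2,1) by a short exact computation. The following Python sketch is an illustration only, under the assumption that a point of the unit circle is written $z = e^{2\pi i t}$ with t a rational number in [0, 1); the function names `square_map`, `folded_map` and `inverses` are hypothetical. Case (a) becomes $t \mapsto 2t \pmod 1$, and case (b) becomes $t \mapsto 2t \pmod 1$ on the upper semicircle and $t \mapsto -2t \pmod 1$ on the lower one.

```python
from fractions import Fraction


def square_map(t):
    """Case (a): w = z^2 on |z| = 1, with z = exp(2*pi*i*t), t in [0, 1)."""
    return (2 * t) % 1


def folded_map(t):
    """Case (b): w = z^2 on the upper semicircle, conj(z)^2 on the lower."""
    half = Fraction(1, 2)
    return (2 * t) % 1 if t <= half else (-2 * t) % 1


def inverses(mapping, s):
    """All t in [0, 1) with mapping(t) == s.

    Any solution satisfies 2t = s or 2t = -s modulo 1, so only the four
    candidates below need to be tested.
    """
    half = Fraction(1, 2)
    candidates = {
        s / 2,
        s / 2 + half,
        ((1 - s) / 2) % 1,
        ((1 - s) / 2 + half) % 1,
    }
    return sorted(t for t in candidates if mapping(t) == s)


if __name__ == "__main__":
    for k in range(8):
        s = Fraction(k, 8)
        print(s, inverses(square_map, s), inverses(folded_map, s))
```

In the folded case the points t = 1/100 and t = 99/100, on either side of z = 1, have the same image, so a small arc about z = 1 is carried onto a one-sided arc at w = 1; this is the failure of interiority used in the proof.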


THE UNIVERSITY OF VIRGINIA. 


BIBLIOGRAPHY 


1. G. T. Whyburn, “Interior transformations on compact sets,” Duke Mathematical
Journal, vol. 3 (1937), pp. 370-381.

2. S. Eilenberg, “Sur quelques propriétés des transformations localement
homéomorphes,” Fundamenta Mathematicae, vol. 24 (1935), p. 36.

3. O. G. Harrold, Jr., “The non-existence of a certain type of continuous
transformation,” Duke Mathematical Journal, vol. 5, pp. 789-793.

4. W. T. Puckett, Jr., “Concerning local connectedness under the inverse of
certain continuous transformations,” American Journal of Mathematics, vol. 61 (1939),
pp. 750-756.

5. G. T. Whyburn, “Non-alternating transformations,” American Journal of
Mathematics, vol. 56 (1934).

6. G. T. Whyburn, “Completely alternating transformations,” Fundamenta
Mathematicae, vol. 27 (1936), pp. 140-146.




THE CHARACTERIZATION OF PSEUDO-SPHERICAL SETS.* 


By LEONARD M. BLUMENTHAL and GEORGE R. THURMAN.


1. Introduction. We give in this paper the solution of a fundamental 
problem in the distance geometry of the n-dimensional sphere (surface) 
proposed some years ago by Karl Menger. 

Defining a semimetric space as a set of abstract elements (points), to
each pair p, q of which there is attached a non-negative real number pq
(distance) such that pq = qp and pq = 0 if and only if p = q, the problem
of characterizing metrically (i.e., in terms of the distance function) particular
semimetric spaces among the whole class of such spaces naturally arises. For
some of the more important spaces (e.g., euclidean, hyperbolic, spherical)
the existence of a function mapping an arbitrary semimetric space congruently
(i.e., with preservation of distances) upon the space follows from the congruent
embedding in the space of each set of k points of the semimetric space.
A space with this property is said to have congruence order k with respect to
semimetric spaces. It has been shown that the n-dimensional euclidean,
hyperbolic, and spherical spaces have (minimum) congruence order n + 3,
while, on the other hand, Hilbert space does not have any (finite) congruence
order with respect to semimetric spaces.
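
For a finite set the semimetric conditions just stated can be checked mechanically. The Python sketch below is an illustration only; the function name `is_semimetric` and the representation of a space by a square table of distances, with distinct indices standing for distinct points, are assumptions. It tests exactly the conditions of the definition: non-negativity, symmetry, and pq = 0 if and only if p = q. The triangle inequality is not among them.

```python
def is_semimetric(distance):
    """Check the semimetric conditions for a square table of distances.

    distance[i][j] is the distance between points i and j; points with
    different indices are treated as distinct points.
    """
    n = len(distance)
    for i in range(n):
        if distance[i][i] != 0:          # pq = 0 when p = q
            return False
        for j in range(i + 1, n):
            d = distance[i][j]
            if d != distance[j][i]:      # pq = qp
                return False
            if d <= 0:                   # pq > 0 when p and q are distinct
                return False
    return True


# Hypothetical three-point space: semimetric, although 5 > 1 + 1.
example = [
    [0, 1, 5],
    [1, 0, 1],
    [5, 1, 0],
]

if __name__ == "__main__":
    print(is_semimetric(example))
```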

The problem of determining necessary and sufficient conditions for the
congruent embedding of any semimetric space in a given space is thus reducible
to a “finite” problem in the case of those spaces possessing a congruence
order; for if S has congruence order k with respect to semimetric spaces, then
any such space is congruent with a subset of S provided each set of k points
of the space is congruently embeddable in S. Now the class of semimetric
spaces with minimum congruence order m + 1 contains a subclass (spaces with
quasi congruence order m) for each member of which a further reduction in the
characterization problem is possible. A space S has quasi congruence order m
with respect to semimetric spaces provided any semimetric space containing
more than m + 1 points is congruent with a subset of S whenever each m-tuple

* Received September 30, 1939. 

1 Presented to the Society, December 28, 1938. A brief summary of results appeared 
in the Proceedings of the National Academy of Sciences, vol. 24 (1938), pp. 557-558. 

2 In this paper the class of comparison spaces is invariably the class of semimetric
spaces, and hence the phrase “with respect to semimetric spaces” is frequently omitted.




of its points is congruently embeddable in S. The n-dimensional euclidean
and hyperbolic spaces belong to such a subclass of the class of semimetric
spaces having minimum congruence order n + 3, since each of these spaces has
quasi congruence order n + 2. On the other hand, the n-dimensional spherical
space $S_{n,r}$ (the “surface” of a sphere of radius r in a euclidean space of n + 1
dimensions, with geodesic (shorter arc) distance), though it has, as remarked
above, minimum congruence order n + 3, is not a member of this subclass since
the $S_{n,r}$ does not have quasi congruence order n + 2.

That this is the case is immediately verified upon noting that the $S_{n,r}$
contains an equilateral set of n + 2 points (i.e., a set of n + 2 points with
all of the $\tfrac{1}{2}(n+1)(n+2)$ mutual distances equal) but does not contain an
equilateral (n + 3)-tuple. Hence, if P is a space of arbitrary power exceeding
n + 3, such that $pq = r \cos^{-1}(-1/(n+1))$ (the “side” of an equilateral
(n + 2)-tuple of $S_{n,r}$) for $p \neq q$, and $pq = 0$ when $p = q$, $(p, q \in P)$, then
P is a semimetric space, containing more than n + 3 points, which is not
congruent with a subset of $S_{n,r}$ though each set of n + 2 points of P is congruent
with n + 2 points of $S_{n,r}$. A semimetric space which is not congruent
with a subset of the $S_{n,r}$ though each n + 2 of its points may be embedded
congruently in the $S_{n,r}$, is called a pseudo-$S_{n,r}$ set. As illustrated by the set P
defined above, pseudo-$S_{n,r}$ sets may be of arbitrary power exceeding n + 2.
This is in marked contrast to the analogous pseudo-euclidean sets, for a
pseudo-$E_n$ set is restricted to consist of exactly n + 3 points, due to the quasi
congruence order n + 2 property of the $E_n$. The metric structure of pseudo-$E_n$
sets is readily described.$^{3}$

The characterization of pseudo-$S_{n,r}$ sets was proposed by Menger in 1931.$^{4}$
The equilateral set P is a pseudo-$S_{n,r}$ set, but are all pseudo-$S_{n,r}$ sets of this
simple structure? The principal result of this paper permits us to say that
if a pseudo-$S_{n,r}$ set contains more than n + 3 points, and no two of the points
have a distance equal to $d = \pi r$, then this query is to be answered essentially
in the affirmative. The meaning of the qualification of the above statement
given by the word “essentially” will become clear later.

Pseudo-$S_{n,r}$ sets of exactly n + 3 points (it is proved in Theorem 6 that
no diametral pair of points, i.e., two points with distance d, can occur in such
a set) have a more varied structure. These sets are described by use of the
spherical analogue of the isogonal conjugate transformation of the plane. The
case n = 2 of the ordinary sphere illustrates all of the essential features.


3 See L. M. Blumenthal, “Distance geometries,” University of Missouri Studies,
vol. 13 (1938), pp. 63-64.
4 For n = 1 the term pseudo d-cyclic is used. Concerning the characterization of
pseudo d-cyclic and pseudo-$S_{2,r}$ sets see “Distance geometries,” pp. 74-81.




It is easily seen that a semimetric set of five points $p_1, p_2, \ldots, p_5$ forms a
pseudo-$S_{2,r}$ quintuple if and only if the sphere contains five points $s_1, s_2, \ldots, s_5$
such that (1) $s_5$ is equidistant (with distance $R \neq s_4s_5$) from the three
reflected images $s_4'$, $s_4''$, $s_4'''$ of the point $s_4$ in the great circles $C(s_2, s_3)$,
$C(s_1, s_3)$, $C(s_1, s_2)$, respectively, determined by the independent (i.e., not on
a great circle) points $s_1, s_2, s_3$, and (2) the mutual distances of the points
$p_1, p_2, \ldots, p_5$ equal the corresponding distances of the points $s_1, s_2, \ldots, s_5$
except for the distance $p_4p_5$, which equals R instead of $s_4s_5$.$^{5}$ Thus, the four
distances $p_1p_5$, $p_2p_5$, $p_3p_5$, and $p_4p_5 = R$ are functions of the six mutual
distances of the four points $p_1, p_2, p_3, p_4$. The actual expressions for these four
distances have not been computed (nor has the analogous computation for
pseudo-plane sets been made). One should note here a complication that arises
in the spherical analogue of the isogonal conjugate transformation which is
not present in the plane. A point $s_5$ of the $S_{2,r}$ is not uniquely determined by
being equidistant from the three points $s_4'$, $s_4''$, $s_4'''$, since a pair of diametral
points satisfies this condition. Thus, the transformation on the sphere is not
one-to-one, as it is (apart from certain exceptional points) in the plane, but
is one-to-two, or rather, two-to-two, since a diametral pair is transformed into
a diametral pair.

The process sketched for pseudo-$S_{2,r}$ quintuples is representative of the
procedure for pseudo-$S_{n,r}$ (n + 3)-tuples. The determination of their metric
structure is, then, so closely related to the analogous problem for pseudo-$E_n$
(n + 3)-tuples as to present nothing essentially new. On the other hand,
one may surely expect quite different results when pseudo-$S_{n,r}$ sets of more
than n + 3 points are considered, for their euclidean analogues (pseudo-$E_n$
sets of more than n + 3 points) do not exist. Furthermore, the complication
due to the one-to-two character of the transformation described above makes
itself felt only for pseudo-$S_{n,r}$ sets of more than n + 3 points. Such sets may
contain diametral points unless the contrary is explicitly assumed.


2. Basic and derived properties of the $S_{n,r}$ and its subspaces. Many
properties of the $S_{n,r}$ and its k-dimensional subspaces $S_{k,r}$, $0 \le k \le n$, (the
sections of the $S_{n,r}$ made by (k + 1)-dimensional hyperplanes through the
center of the sphere in $E_{n+1}$ whose “surface” is the $S_{n,r}$) are needed for the
development of this paper. Looking towards the “abstraction” of our problem
made in Section 4, we isolate here the five basic properties of the $S_{n,r}$ that

5 Throughout this paper the points of pseudo-$S_{n,r}$ sets are denoted by the letters
p and q, while the points of $S_{n,r}$ are symbolized by the letters s and t.




suffice to demonstrate all the additional properties of the $S_{n,r}$ that we need for
this investigation.$^{6}$


I. The determinant

$\Delta_{n+2}(s_1, s_2, \ldots, s_{n+2}) = |\cos(s_is_j/r)|$, $(i, j = 1, 2, \ldots, n+2)$,

vanishes for each set of n + 2 points $s_1, s_2, \ldots, s_{n+2}$ of $S_{n,r}$.


II. There exists at least one set of n + 1 points of $S_{n,r}$ whose determinant
$\Delta_{n+1}$ does not vanish.


III. Each finite subset of $S_{n,r}$ has a non-negative determinant $\Delta$.


Remark. It is easy to show that the dependence (independence) of a
finite set $s_1, s_2, \ldots, s_k$ of k points of $S_{n,r}$ is equivalent to the vanishing
(non-vanishing) of the determinant $\Delta_k$ of the k points. We call k points of a
semimetric space independent (dependent) if they are congruent with k
independent (dependent) points of $S_{n,r}$.


IV. If $s_1, s_2, \ldots, s_{k+1}$; $t_1, t_2, \ldots, t_{k+1}$ are two congruent sets of
k + 1 points (not necessarily pairwise distinct) of two (coincident or distinct)
k-dimensional subspaces of $S_{n,r}$, $k \le n$, then to each point s of the subspace
containing $s_1, s_2, \ldots, s_{k+1}$ there corresponds at least one point t of the
subspace containing $t_1, t_2, \ldots, t_{k+1}$ such that

$s_1, s_2, \ldots, s_{k+1}, s \approx t_1, t_2, \ldots, t_{k+1}, t$.$^{7}$


V. If $s_1, s_2, \ldots, s_k$ are k independent points of $S_{k,r}$, $(k = 1, 2, \ldots, n)$,
then corresponding to each point s of $S_{k,r}$ independent of $s_1, s_2, \ldots, s_k$ (i.e.,
$\Delta_{k+1}(s_1, s_2, \ldots, s_k, s)$ does not vanish), there is at least one point s' of $S_{k,r}$
such that $s' \neq s$ and $s_1, s_2, \ldots, s_k, s \approx s_1, s_2, \ldots, s_k, s'$.
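
Properties I and II can be illustrated numerically in the lowest case n = 1, where $S_{1,r}$ is a circle of radius r with shorter-arc distance. The Python sketch below is an illustration under stated assumptions: points of the circle are represented by their angles, and the names `geodesic_distance`, `cosine_matrix`, `det2` and `det3` are hypothetical helpers. It forms the determinant $|\cos(s_is_j/r)|$ for three points, which should vanish up to rounding, and for two points, which should be positive when the points are neither identical nor diametral.

```python
import math


def geodesic_distance(angle_a, angle_b, radius):
    """Shorter-arc distance between two points of a circle of given radius."""
    gap = abs(angle_a - angle_b) % (2 * math.pi)
    return radius * min(gap, 2 * math.pi - gap)


def cosine_matrix(angles, radius):
    """Matrix of cos(s_i s_j / r) for the points with the given angles."""
    return [
        [math.cos(geodesic_distance(a, b, radius) / radius) for b in angles]
        for a in angles
    ]


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def det3(m):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


if __name__ == "__main__":
    radius = 2.5
    print(det3(cosine_matrix([0.3, 1.7, 4.0], radius)))  # close to zero
    print(det2(cosine_matrix([0.3, 1.7], radius)))       # positive
```

For two points at angular separation θ the second determinant equals $1 - \cos^2\theta = \sin^2\theta$, so it vanishes only for identical or diametral points.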


We list now for convenience the derived properties of the $S_{n,r}$ to which
reference will be made in the next section:


(a). A semimetric set of k + 2 points is congruent with k + 2 points
of $S_{k,r}$, $k \le n$, if and only if each k + 1 of the points are congruent with
k + 1 points of the $S_{k,r}$ and the determinant $\Delta_{k+2}$ of the k + 2 points vanishes.


6 The derivation of these additional properties from the five basic ones follows
closely the methods of an earlier paper (L. M. Blumenthal, “The geometry of a class
of semimetric spaces,” Tôhoku Mathematical Journal, vol. 43 (1937), pp. 205-224).
7 This notation signifies that $s_is_j = t_it_j$,




(b). A semimetric set of k + 3 points is congruent with k + 3 points
of $S_{k,r}$, $k \le n$, if and only if each k + 2 of the points are congruent with
k + 2 points of the $S_{k,r}$ and the determinant $\Delta_{k+3}$ of the k + 3 points vanishes.$^{8}$


(c). If $s_1, s_2, \ldots, s_{k+1}$; $t_1, t_2, \ldots, t_{k+1}$ are two congruent sets of k + 1
independent points of two (coincident or distinct) k-dimensional subspaces,
$k \le n$, and if s, s' are points of the subspace containing $s_1, s_2, \ldots, s_{k+1}$, while
t, t' are points of the subspace containing $t_1, t_2, \ldots, t_{k+1}$ such that

$s_1, s_2, \ldots, s_{k+1}, s \approx t_1, t_2, \ldots, t_{k+1}, t$,
$s_1, s_2, \ldots, s_{k+1}, s' \approx t_1, t_2, \ldots, t_{k+1}, t'$,

then ss' = tt'.


Remark 1. If $s_1, s_2, \ldots, s_{k+1}$; $t_1, t_2, \ldots, t_{k+1}$ are two congruent sets
of k + 1 independent points of two (coincident or distinct) k-dimensional
subspaces, $k \le n$, then to each point s of the subspace containing the first
set of k + 1 points there corresponds exactly one point t of the subspace
containing the second set of k + 1 points such that

$s_1, s_2, \ldots, s_{k+1}, s \approx t_1, t_2, \ldots, t_{k+1}, t$.

Remark 2. There is at most one point of $S_{k,r}$ with prescribed distances
from k + 1 independent points of $S_{k,r}$, $k \le n$.

(d). Any subset of an independent set of points is an independent set 
of points. 

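(e). If $s_1, s_2, \ldots, s_k$ are k independent points of $S_{k,r}$, $(k = 1, 2, \ldots, n)$,
then corresponding to each point s of $S_{k,r}$ which is independent of them there
is exactly one point s' of $S_{k,r}$ such that $s' \neq s$ and

$s_1, s_2, \ldots, s_k, s \approx s_1, s_2, \ldots, s_k, s'$.$^{9}$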


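(f). Let $s_1, s_2, \ldots, s_k$ be k independent points of $S_{k,r}$, $(k = 1, 2, \ldots, n)$,
and let s, s' and t, t' be two pairs of points of $S_{k,r}$ such that t is either
dependent on $s_1, s_2, \ldots, s_k$ or distinct from s, and t' is either dependent on
$s_1, s_2, \ldots, s_k$ or distinct from s', while

$s_1, s_2, \ldots, s_k, s \approx s_1, s_2, \ldots, s_k, t$,
$s_1, s_2, \ldots, s_k, s' \approx s_1, s_2, \ldots, s_k, t'$.

Then ss' = tt'.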


8 Properties (a), (b) are well-known theorems in the distance geometry of the
$S_{n,r}$ (see “Distance geometries,” p. 73). That they may be obtained by using merely
Properties I-V has not been recorded heretofore.
9 The point s has a single image $s' \neq s$ when “reflected” in the (k - 1)-dimensional
subspace determined by the k independent points $s_1, s_2, \ldots, s_k$.




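(g). Let $s_1, s_2, \ldots, s_{k+1}$ be k + 1 independent points of $S_{k,r}$, $k \le n$,
and let s be a point of $S_{k,r}$ not common to any two of the k + 1 subspaces
$S_{k-1,r}$ determined by the k points

$s_1, \ldots, s_{i-1}, s_{i+1}, \ldots, s_{k+1}$, $(i = 1, 2, \ldots, k+1)$.

Denote by $s^{(i)}$ a point of $S_{k,r}$ such that

$s_1, \ldots, s_{i-1}, s_{i+1}, \ldots, s_{k+1}, s \approx s_1, \ldots, s_{i-1}, s_{i+1}, \ldots, s_{k+1}, s^{(i)}$, $(i = 1, 2, \ldots, k+1)$,

where $s^{(i)} \neq s$ if s is independent of $s_1, \ldots, s_{i-1}, s_{i+1}, \ldots, s_{k+1}$. Then
there are not two independent points s', t' of $S_{k,r}$ such that


Remark. There are at most two points of $S_{k,r}$ satisfying the conditions
of property (g).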


(h). The $S_{k,r}$, $k \le n$, has minimum congruence order k + 3.$^{10}$


(i). If $s_1, s_2, \ldots, s_{k+1}$ are k + 1 independent points of $S_{k,r}$,
and s is a point of $S_{k,r}$ such that for each integer i, $(i = 1, 2, \ldots, k)$, the
points $s_1, \ldots, s_{i-1}, s_{i+1}, \ldots, s_{k+1}, s$ are dependent, then the points s, $s_{k+1}$
are dependent.


(j). If s, t, u are any three distinct points of $S_{k,r}$, $k \le n$, such that
st = tu = su, and if $s_1, s_2, \ldots, s_k$ are k points of $S_{k,r}$ such that
$s_is = s_it = s_iu$, $(i = 1, 2, \ldots, k)$,
then the points $s_1, s_2, \ldots, s_k$ are dependent.


(k). Let s, t be any two distinct points of $S_{k,r}$, $(k = 1, 2, \ldots, n)$,
and let $s_1, s_2, \ldots, s_k$ be k independent points of $S_{k,r}$ such that $s_is = s_it$,
$(i = 1, 2, \ldots, k)$. The (k - 1)-dimensional subspace $S_{k-1,r}$ determined by
$s_1, s_2, \ldots, s_k$ is the locus of points of $S_{k,r}$ equidistant from s and t.


(l). Two distinct k-dimensional subspaces $S_{k,r}$, $k \le n$, can have at most
k independent points in common.


3. The characterization of pseudo-$S_{k,r}$ sets, $k \le n$. We deduce first
some preliminary theorems concerning pseudo-$S_{k,r}$ sets, $k \le n$, that point the
way to the desired characterization theorems.


10 “Distance geometries,” p. 73.




THEOREM 1. A pseudo-$S_{k,r}$ (k + 3)-tuple $p_1, p_2, \ldots, p_{k+3}$ contains at
least one independent set of k + 1 points.


Proof. Since $p_1, p_2, \ldots, p_{k+3}$ is a pseudo-$S_{k,r}$ set, each (k + 2)-tuple
contained in these points is congruent with k + 2 points of $S_{k,r}$. It follows
(property (a)) that the determinant $\Delta_{k+2}$ of each (k + 2)-tuple vanishes.
Suppose, now, that each set of k + 1 of the k + 3 points is a dependent set.
Then the determinant $\Delta_{k+3}(p_1, p_2, \ldots, p_{k+3})$ has all principal minors of orders
k + 1 and k + 2 equal to zero, and consequently vanishes. It follows (property
(b)) that the k + 3 points are congruent with a subset of the $S_{k,r}$, which
contradicts the hypothesis that they form a pseudo-$S_{k,r}$ set, and establishes
the theorem.


THEOREM 2. The determinant Po,* * Puss) Of a pseudo-Sy,r 
(k + 3)-tuple is negative. 


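Proof. As seen in the proof of Theorem 1, the determinant

$$\Delta_{k+3}(p_1, p_2, \dots, p_{k+3})$$

does not vanish. Let the points be so labelled that $p_1, p_2, \dots, p_{k+1}$ is an
independent $(k + 1)$-tuple. Since

$$\Delta_{k+2}(p_1, \dots, p_{k+1}, p_{k+2}) = \Delta_{k+2}(p_1, \dots, p_{k+1}, p_{k+3}) = 0,$$

we have ¹²

$$\Delta_{k+3}(p_1, p_2, \dots, p_{k+3}) = \frac{-[k+2, k+3]^2}{\Delta_{k+1}(p_1, p_2, \dots, p_{k+1})},$$

where $[k + 2, k + 3]$ denotes the co-factor of the element in the $(k + 2)$-nd
row and $(k + 3)$-rd column of $\Delta_{k+3}(p_1, p_2, \dots, p_{k+3})$. The points
$p_1, p_2, \dots, p_{k+1}$ being independent and congruent to $k + 1$ points of $S_{k,r}$
implies $\Delta_{k+1}(p_1, p_2, \dots, p_{k+1}) > 0$, (property III), and the theorem is
proved.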
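The determinant expansion used in this proof (the special case of Jacobi's theorem quoted in footnote 12) can be checked on a concrete matrix. The following is a minimal sketch, assuming the expansion reads $\Delta_{k+3} \cdot \Delta_{k+1} = [k+2,k+2] \cdot [k+3,k+3] - [k+2,k+3]^2$, with $\Delta_{k+1}$ the principal minor left after deleting the last two rows and columns. The function names and the sample matrix are illustrative and do not come from the paper.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant via Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for c in range(n):
        # Find a row at or below the diagonal with a non-zero entry in column c.
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result  # a row swap flips the sign
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= factor * a[c][j]
    return result


def cofactor(matrix, i, j):
    """Signed minor obtained by deleting row i and column j (0-based)."""
    n = len(matrix)
    minor = [[matrix[r][c] for c in range(n) if c != j]
             for r in range(n) if r != i]
    return (-1) ** (i + j) * det(minor)


def jacobi_sides(matrix):
    """Return both sides of the expansion for the last two rows and columns."""
    n = len(matrix)
    p, q = n - 2, n - 1
    leading = [row[:p] for row in matrix[:p]]  # delete the last two rows/columns
    lhs = det(matrix) * det(leading)
    rhs = (cofactor(matrix, p, p) * cofactor(matrix, q, q)
           - cofactor(matrix, p, q) ** 2)
    return lhs, rhs


# Hypothetical symmetric test matrix of order 4 (the case k = 1, order k + 3).
SAMPLE = [
    [2, 1, 0, 3],
    [1, 3, 1, 1],
    [0, 1, 4, 2],
    [3, 1, 2, 5],
]
lhs, rhs = jacobi_sides(SAMPLE)
print(lhs == rhs)
```

When the two diagonal co-factors vanish, as they do in the proof above, the identity reduces to the quotient formula for $\Delta_{k+3}$, and the sign of $\Delta_{k+3}$ is opposite to that of $\Delta_{k+1}$.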

THEOREM 3. If $p_1, p_2, \dots, p_{k+3}$ form a pseudo-$S_{k,r}$ set, with the points

11 We shall suppose the index $k$ to assume the values $k = 0, 1, 2, \dots, n$ except when
it is stated otherwise.

12 The expansion of a $(k + 3)$-rd order symmetric determinant $\Delta_{k+3}$,

$$\Delta_{k+3} \cdot \Delta_{k+1} = [k+2, k+2] \cdot [k+3, k+3] - [k+2, k+3]^2,$$

in which $\Delta_{k+1}$ is the principal minor obtained by deleting the last two rows and columns
(a special case of Jacobi's theorem), is particularly useful whenever, as frequently
happens in this paper, one of the co-factors $[k + 2, k + 2]$, $[k + 3, k + 3]$ vanishes.




842 LEONARD M. BLUMENTHAL AND GEORGE R. THURMAN. 


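$p_1, p_2, \dots, p_{k+1}$ independent, then the $S_{k,r}$ contains $k + 3$ points $s_1, s_2, \dots, s_{k+3}$
such that

$$p_1, \dots, p_{k+1}, p_{k+2} \approx s_1, \dots, s_{k+1}, s_{k+2},$$
$$p_1, \dots, p_{k+1}, p_{k+3} \approx s_1, \dots, s_{k+1}, s_{k+3},$$

and $p_{k+2}p_{k+3} \ne s_{k+2}s_{k+3}$.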


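Proof. Since $p_1, p_2, \dots, p_{k+3}$ form a pseudo-$S_{k,r}$ set, there exist two sets
$s_1, s_2, \dots, s_{k+2}$ and $t_1, t_2, \dots, t_{k+1}, t_{k+3}$ of $k + 2$ points of $S_{k,r}$ such that

$$p_1, \dots, p_{k+1}, p_{k+2} \approx s_1, \dots, s_{k+1}, s_{k+2},$$
$$p_1, \dots, p_{k+1}, p_{k+3} \approx t_1, \dots, t_{k+1}, t_{k+3}.$$

The points $p_i$ being independent, then $\{s_i\}$ and $\{t_i\}$, $(i = 1, 2, \dots, k + 1)$,
are two congruent sets of $k + 1$ independent points of $S_{k,r}$, and
hence (Remark 1, property (c)) the $S_{k,r}$ contains exactly one point $s_{k+3}$
such that

$$s_1, \dots, s_{k+1}, s_{k+3} \approx t_1, \dots, t_{k+1}, t_{k+3}.$$

Then we have

$$p_1, p_2, \dots, p_{k+1}, p_{k+3} \approx s_1, s_2, \dots, s_{k+1}, s_{k+3},$$

and since the $k + 3$ points $p_1, p_2, \dots, p_{k+3}$ form a pseudo-$S_{k,r}$ set, the distance
$p_{k+2}p_{k+3}$ does not equal the distance $s_{k+2}s_{k+3}$.
The $k + 3$ points $s_1, s_2, \dots, s_{k+3}$ of $S_{k,r}$ are said to be "almost
congruent" to the pseudo-$S_{k,r}$ set $p_1, p_2, \dots, p_{k+3}$.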
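
THEOREM 4. Let $p_1, p_2, \dots, p_{k+3}$ form a pseudo-$S_{k,r}$ $(k + 3)$-tuple with
the independent $(k + 1)$-tuple $p_1, p_2, \dots, p_{k+1}$. Then the $k + 1$ points
$p_2, p_3, \dots, p_{k+1}, p_{k+2}$ are independent.
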
Proof. Let 81, 8,° + *,Sx.3 be k +3 points of S;,, almost congruent to 


and Sk+28k+3- Since each k 2 points of the set py, Piss 
are congruent with & + 2 points of Sir, we have 


and hence 


with the points s2,83,° - *, Ss. being independent since they belong to the 
independent (k + 1)-tuple s;, (property (d)). 


Property IV applied to the congruence 
to, * * Soy * Skat 


entails the existence of two points s,s’ of S;,, such that 


Then 
Suppose, now, that the points ps, ps,° are dependent. Then 
Soy * Sks1) Skxg are dependent, and applying Remark 2, property (c) to 


the Sx1,. determined by the & independent points 83,° Sks1, the above 
congruence gives = S42, and hence fs, , tere, larg are congruent 


with So, S3,° * * 5 Sk415 in the usual order. 
Now 
So, 5 ~ P2 Pk+3 ~ to, ts, bicsty thas, 
which gives 83,° * * 5 S25 If is not dependent 


ON S2,S3,° * * Sk, then by property (e) the point s’ may be distinct from 
Sixge But then the last congruence together with the congruence 


shows that sx.2Si13 = Sx428’, according to property (f). If, on the other hand, 
Sixg 18 dependent on So, $3,° * * , then the congruence 


evidently implies s’ = ANd = as before. Hence, in any case, 
> 4 
== = = Pk+2Pk+35 
which gives the desired contradiction and establishes the theorem. 


Remark. It is clear that the method of the preceding theorem may be 
used to show that each of the (k + 1)-tuples 


is independent. 


THEOREM 5. Let $p_1, p_2, \dots, p_{k+3}$ form a pseudo-$S_{k,r}$ set with the
independent $(k + 1)$-tuples $p_1, p_2, \dots, p_{k+1}$;

$$p_1, \dots, p_{i-1}, p_{i+1}, \dots, p_{k+2}; \qquad p_1, \dots, p_{i-1}, p_{i+1}, \dots, p_{k+1}, p_{k+3}.$$

Then the $(k + 1)$-tuple $p_3, p_4, \dots, p_{k+3}$ is independent.


Proof. The proof of this theorem follows the lines of the proof of the pre- 
ceding theorem, with the independent + 1)-tuple py, ps, 5 
in the role of the (k + 1)-tuple py, po,- Peo. Thus, the Sz, contains 
k + 3 points * * , Such that 

Pru Pas Pa" Pk+19 Pk+2, Pk+3 ~ 835 S4,° 5 Sk+29 Sk+35 

Pis Pos * 5 Po 81, 83, * * Sk+1y $2, 
With poPr.z ~ S28x%,3, and the same procedure as in Theorem 4 leads to the 
desired result. 


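Remark. In a similar manner, it is seen that the $(k + 1)$-tuples

$$p_1, \dots, p_{i-1}, p_{i+1}, \dots, p_{j-1}, p_{j+1}, \dots, p_{k+1}, p_{k+2}, p_{k+3}, \quad (i, j = 1, 2, \dots, k + 1;\ i \ne j),$$

are all independent $(k + 1)$-tuples.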


We have thus proved the following useful theorem: 


THEOREM 6. If $p_1, p_2, \dots, p_{k+3}$ form a pseudo-$S_{k,r}$ set then each
of the $\tfrac{1}{2}(k + 2)(k + 3)$ sets of $k + 1$ points contained in this $(k + 3)$-tuple
is an independent set.


Let Po,*** form a pseudo-S;,, (k + 3)-tuple, and let s1, 82,°°* , Skis 
be k + 3 points of S;,- almost congruent to them. Consider, now, the points 


The S;,, contains k + 2 points t,, lia, tag such that for each 
1=—1,2,---,k+1, 


Then 


is congruent with the points 
ti, to, ti-1, bist, thats tks25 
with each of the sets of k + 1 points 


S1,S2,° * 5 Si-1y Sk+2, 1, 25° -,k+1), 




being independent (Theorem 6). Hence, by Remark 1, property (c), the Sx,r 
contains exactly one point s(@ such that 


-,k+1), 


and hence 
+ 
Since 


~ Sis Sis 9 Sk+15 Sk+35 


the point has the same distances from the points 
c+ 
81, S2,°  Séi-1y Sitty’ * 5 
as the point but (4) for 


11) = Pki2Pk+3 

We have 

for each of these distances sina Pis2Pk+3- Now a point 8,42 is not determined 
uniquely by these inequalities, but by property (gq) and its accompanying 
Remark there is at most one other point s*;,,. equidistant from the k + 1 

points and == d. 
In a similar manner it is seen that the S;, contains k + 1 points si), 

+, such that 


k+2 
(i= 1,2,---,k+1), 
with each of the distances +1), equal to prsoprss. 
Thus is equidistant from the + 1 points s™, and there 


are at most two points (diametral) satisfying this requirement. 


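THEOREM 7. Let $P = (p_1, p_2, \dots, p_{k+3})$ and $Q = (q_1, q_2, \dots, q_{k+3})$
be two pseudo-$S_{k,r}$ $(k + 3)$-tuples such that

$$p_1, p_2, \dots, p_{k+2} \approx q_1, q_2, \dots, q_{k+2}.$$

Then either $p_i p_{k+3} = q_i q_{k+3}$, $(i = 1, 2, \dots, k + 2)$, and the two $(k + 3)$-tuples
are congruent, or

$$\cos(p_i p_{k+3}/r) + \cos(q_i q_{k+3}/r) = 0, \quad (i = 1, 2, \dots, k + 2).$$
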
Proof. From the hypothesis of the theorem we may write the following 


congruences : 


Pi> * * 5 S15 * 5 Sk+1y 
Pis Pos’ 5 Pk+g ~ $1, * 5 Sk415 
* 5 ~ $1, * 5 Skaty tess, 


where the points on the right-hand side of these congruences are in S;,, and 
Pk+2Pk+3 Sk+28k+35 FF Skrotksse 


From the first two congruences follow, as has been seen, the existence of 
points sti), of such that 


Similarly, from the last two congruences, we may write 


It follows (property (g)) that either sx; = OF In the first 
case, Clearly pipers = Qidirs, (1 =1,2,- - -,k +2), and In the 
second case, and being diametral implies (property III) that the 
determinant A; (Si, tkxg) 9, and hence (upon expanding) 

COS + cos =, 
Then, by the above congruences, 

COs + 608 = 0, 


Finally, = d implies that A. Vanishes; i. e., 
C+ 


cos +- cos (5) = 0. 
Then 
COS + COS = 9, 


and the theorem is proved. 


LEMMA. If a pseudo-$S_{k,r}$ set $P$ of $k + 4$ points $p_1, p_2, \dots, p_{k+4}$, has no
pair of its points diametral, then the set contains at least three pseudo-$S_{k,r}$
$(k + 3)$-tuples.


13 With a view to the "abstraction" of the problem treated in Section 4, we write
$\cos(p_i p_{k+3}/r) + \cos(q_i q_{k+3}/r) = 0$ rather than $p_i p_{k+3} + q_i q_{k+3} = d$.




Proof. Since, by property (h), the $S_{k,r}$ has congruence order $k + 3$,
the set $P$ contains at least one pseudo-$S_{k,r}$ $(k + 3)$-tuple. The labelling may
be assumed so that $p_1, p_2, \dots, p_{k+3}$ is a pseudo-$S_{k,r}$ set. In case $P$ does not
contain at least two $(k + 3)$-tuples congruent with $k + 3$ points of $S_{k,r}$, the
lemma is surely valid. In the contrary case, let p,, * Pers and 
Dis * * 5 Pests be congruent with two (k-+ 3)-tuples of 
Since Pi, 18 a pseudo-S;,, set, we have 

P2> Pk+15 Pr+e S2,° * 5 Skit, Sk+25 


With and it follows that 


(I) Pis Pos’ 5 Pk+a S15 * Sheet, Sk+29 Sk+ay 

and the point $s_{k+4}$ of $S_{k,r}$ is uniquely determined since its distances from the
$k + 1$ independent points $s_1, s_2, \dots, s_{k+1}$ of $S_{k,r}$ are fixed. Now, by hypothesis,
each pair of points of $P$ is independent (i. e., no two points of $P$ are coincident
or diametral) and hence the points $s_{k+4}$ and $s_i$, $(i = 1, 2, \dots, k + 3)$, are
independent. It follows that at least two of the $k + 1$ sets

$$s_1, \dots, s_{i-1}, s_{i+1}, \dots, s_{k+1}, s_{k+4}, \quad (i = 1, 2, \dots, k + 1),$$

are independent $(k + 1)$-tuples, for in the contrary case, the $k$ dependent
$(k + 1)$-tuples have, in addition to the point $s_{k+4}$, one of the points $s_1, s_2, \dots, s_{k+1}$
in common. This point is then, by property (i), diametral to (or coincident
with) the point $s_{k+4}$, in contradiction to the preceding remark.

Let, then, Sks1y Skea ANG 81,83,° * 5 Sk1 Steg be independent 


=> 


(k + 1)-tuples. We show that the two (k + 3)-tuples 


Pes Pk+15 Pk+25 Pk+4 
and 


are pseudo-S;,, sets. (A similar procedure gives the desired result in case two 


other (k + 1)-tuples are independent. ) 


We make the assumption that po, Pkses Para CONZTuent 
with k + 3 points of S;,, and show that this assumption leads to a contra- 


diction. (The other (k + 3)-tuple is treated in the same manner.) 


Suppose, then, that 




of S;,,. Then (using congruences (1)) it follows that 


with 82,83,° * *,Sks1,Sk+4 an independent (k-+1)-tuple. It follows that 
Sk+28k13 = = (the first equality resulting from property (c) ) 
gives the desired contradiction, and proves that the (k + 3)-tuple 


P2s 5 Pk+29 Pk+39 Pk+45 
is a pseudo-S;, set. 


THEOREM 8. If a pseudo-$S_{k,r}$ $(k + 4)$-tuple $P$ has no pair of its points
diametral, then each $(k + 3)$-tuple contained in $P$ is a pseudo-$S_{k,r}$ set.

Proof. From the Lemma, $P$ contains at least three pseudo-$S_{k,r}$ $(k + 3)$-tuples, say

$$p_1, p_2, \dots, p_{k+1}, p_{k+2}, p_{k+3}; \qquad p_1, p_2, \dots, p_{k+1}, p_{k+2}, p_{k+4};$$
$$p_1, p_2, \dots, p_{k+1}, p_{k+3}, p_{k+4}.$$

Applying Theorem 7 to the first and second of these pseudo-S;,, sets, and 


then to the first and third of these sets, we see that the distances determined 
by the points of P satisfy one of the following four sets of relations: 


Case I. Pks2Pk+s = Piss = 
PiPks2 = PiPirs = 
Case II. O08 = COS (Press = — COS 
COS = COS (PiPrss/%) = — COS 
Case IIT. C08 (pis = — COS = COS 
COS (PiPks2/?) = — COS == — COS 
(t= 1,2,---,k+1). 
Case 1V. COS == COS == — COS 
COS = — COS ( PiPess/1) = COS 


4-4). 


To show, for example, that ps, Pk+2, 18 a pSeudo-Sy,+ 
(k + 3)-tuple, assume the contrary and let S2, , Sk+35 Shes be 
k +3 points of S;,, congruent to them. Then the distances determined by 
these k + 3 points of the S;,, satisfy one of the above four sets of relations 
(with the index i taking on the values 2,3,---,k-+ 1). 




Case I. Applying property (j) to $s_2, s_3, \dots, s_{k+4}$, it is seen that
$s_2, s_3, \dots, s_{k+1}$ are dependent. Then the points $p_2, p_3, \dots, p_{k+1}$ congruent
to them are dependent, which is not possible (property (d)) since these $k$
points are contained in the $k + 1$ points $p_1, p_2, \dots, p_{k+1}$ which are independent
(Theorem 6) since they belong to the pseudo-$S_{k,r}$ $(k + 3)$-tuple $p_1, p_2, \dots, p_{k+3}$.


Case II. The points $s_2, s_3, \dots, s_{k+1}$ are each equidistant from the points
$s_{k+2}, s_{k+3}$ and since these $k$ points are independent it follows from property (k)
that the $(k-1)$-dimensional subspace $S^*_{k-1,r}$ determined by them is the
locus of points of $S_{k,r}$ equidistant from $s_{k+2}$ and $s_{k+3}$. Now the points
$s_2, s_3, \dots, s_{k+1}$ are also contained in the locus of points $s$ of $S_{k,r}$ such that
$\cos(s s_{k+2}/r) + \cos(s s_{k+4}/r) = 0$. Since $s_{k+2}s_{k+4} \ne 0, d$, the points $s_{k+2}$, $s_{k+4}$
are distinct and not diametral. We prove now the following assertion:


ASSERTION. The locus of points $t$ of $S_{k,r}$ such that
$$\cos(t s_{k+2}/r) + \cos(t s_{k+4}/r) = 0$$
is the $(k-1)$-dimensional subspace of $S_{k,r}$ determined by $s_2, s_3, \dots, s_{k+1}$.


To prove this assertion, it is shown first that if ¢,, t2,---,t.4, are k +1 
pairwise distinct points of such that cos + cos = 0, 
then the determinant Ag,; (41, +, of these points vanishes, and hence 
the points are in an S;,,. For suppose this determinant does not vanish, 
and consider Ax,3(t1, + Stra), Which clearly equals zero. Adding 
the elements of the last row (column) to the corresponding elements of the 
preceding row (column), and using the expansion ?* employed in Theorem 2, 


we obtain, after some obvious reductions, 


Since ~ d, it follows that te, vanishes, contrary to 
our supposition. Hence each set of k + 1 points of the locus is a dependent 
set, and the locus is at most ( —1)-dimensional. But the locus contains the 
k independent points $2, 83,°** , Sz: Of S;,r. It follows that the locus is exactly 
— 1)-dimensional. 

Finally, let s A s;, (i= 2,3,---,4 +1) be any element of the (k—1)- 
dimensional subspace determined by $2, *, Sk. Now the determinant 
(S25 835° 5 Skea, 8) zero, and every principal minor of order 
k + 2 vanishes. It follows that every (k + 2)-nd order minor of the deter- 
minant is zero. Adding the (k + 2)-nd row (column) to the preceding row 


14 See footnote 12. 




(column) —-a transformation of the determinant which leaves the rank 
unaltered — and expanding, as above, the vanishing (k + 2)-nd order minor 
[k& + 2,4 + 2] of the resulting determinant, we obtain, 


2[1 + COS ] Ansa (S25 82° * S) 
— [cos + cos |]? * An (S2,° Sear) 


Since * 5 =O and Ax(S2, S3,° * Goes not vanish, 
we have cos (8842/1) + cos (8sx,4/7) =0, and s is a point of the locus. 
Hence the assertion is proved, and the locus in question identified with the 
(4 —1)-dimensional subspace S*;_1,, determined by the k independent points 
82, 83,° * 5 Sk+is 

But this is impossible, for since cos (Sg,2Sr42/7) + COS (SkssSk+4/7) = 9, 
the point s;,, belongs to the above locus, though it surely does not belong to 
for is not equidistant from the points S440, 

This contradiction shows that the distances determined by the points 
So, 835° * 5 Sk+2y Sks3 GO not satisfy the relations of Case IT. 

Interchanging with in Case III, and with in Case IV 
reduces these cases to Case II, and hence the assumption that 


25 P2> Pk+15 Pk+25 Pk+35 Pk+4 


is not a pseudo-$S_{k,r}$ set leads to the distances determined by these points not
satisfying the relations of any one of the above four cases. This contradiction
proves the $k + 3$ points form a pseudo-$S_{k,r}$ set. A similar procedure is used
to show that the $k$ remaining $(k + 3)$-tuples of the set $P$ are each pseudo-$S_{k,r}$
sets, and the theorem is established.


COROLLARY. If $P$ is a pseudo-$S_{k,r}$ set containing more than $k + 3$ points,
no pair of which is diametral, then every set of $m \ge k + 3$ points of $P$ is a
pseudo-$S_{k,r}$ set.


Proof. Since $P$ is a pseudo-$S_{k,r}$ set there is at least one pseudo-$S_{k,r}$
$(k + 3)$-tuple $q_1, q_2, \dots, q_{k+3}$ contained in $P$ (property (h)). Clearly, any
subset of $P$ that contains these $k + 3$ points is a pseudo-$S_{k,r}$ set. Suppose,
now, that $p_1, p_2, \dots, p_m$ is a set of $m \ge k + 3$ points of $P$ not
containing any of the points $q_1, q_2, \dots, q_{k+3}$. Then $q_1, q_2, \dots, q_{k+3}, p_1$ is a
pseudo-$S_{k,r}$ $(k + 4)$-tuple without diametral points and hence, by the preceding
theorem, each set of $k + 3$ of these points (in particular, the set
$q_2, q_3, \dots, q_{k+3}, p_1$) is a pseudo-$S_{k,r}$ set. Then $q_2, q_3, \dots, q_{k+3}, p_1, p_2$ is a
pseudo-$S_{k,r}$ $(k + 4)$-tuple without diametral points and hence, as before,




is a pseudo-$S_{k,r}$ $(k + 4)$-tuple without diametral points. It is clear that this
process may be continued until the points $q_1, q_2, \dots, q_{k+3}$ are replaced by the
points $p_1, p_2, \dots, p_{k+3}$ forming a pseudo-$S_{k,r}$ $(k + 3)$-tuple. Then the points
$p_1, p_2, \dots, p_m$ surely form a pseudo-$S_{k,r}$ set.

Finally, if i= of the points occur among the 
m-tuple of points p;, * Pm, We have, with convenient labelling, 
Pi = (J =1, and the above process, starting with the pseudo-S;., 
+ 3)-tuple * * Pip * Pi, is applied as before to 
complete the proof of the corollary. 


LEMMA. Let $P = (p_1, p_2, \dots, p_{k+4})$ be a pseudo-$S_{k,r}$ $(k + 4)$-tuple
without diametral points. Then $p_i p_m = p_j p_m$ or

$$\cos(p_i p_m/r) + \cos(p_j p_m/r) = 0, \quad (i, j, m = 1, 2, \dots, k + 4;\ i \ne m,\ j \ne m).$$

Proof. If $i = j$ the lemma is trivial. Suppose, then, $i \ne j$, and consider
the two $(k + 3)$-tuples obtained from $P$ by omitting, in turn, the points $p_i$ and $p_j$,
respectively. According to Theorem 8 these two $(k + 3)$-tuples are pseudo-$S_{k,r}$ sets, and
since the $(k + 2)$-tuple

$$p_1, \dots, p_{i-1}, p_{i+1}, \dots, p_{j-1}, p_{j+1}, \dots, p_{k+4}$$

is common to both sets, an application of Theorem 7 gives at once the desired
result.


THEOREM 9. Let $P = (p_1, p_2, \dots, p_{k+4})$ be a pseudo-$S_{k,r}$ $(k + 4)$-tuple
without diametral points. Then $p_i p_m = p_j p_n$ or

$$\cos(p_i p_m/r) + \cos(p_j p_n/r) = 0, \quad (i \ne m;\ j \ne n),$$

for each pair $p_i p_m$, $p_j p_n$ of the $\tfrac{1}{2}(k + 3)(k + 4)$ distances determined by the
points of $P$.


Proof. If one of the indices $i, m$ equals one of the indices $j, n$ the theorem
reduces to the preceding lemma. Suppose this is not the case. According
to the lemma, $p_i p_m = p_j p_m$ or $\cos(p_i p_m/r) + \cos(p_j p_m/r) = 0$,
and $p_j p_m = p_j p_n$ or $\cos(p_j p_m/r) + \cos(p_j p_n/r) = 0$. A consideration of the
four possibilities thus presented leads at once to the theorem.

We may now establish a characterization theorem for pseudo-$S_{k,r}$ sets of
$k + 4$ points, no two points being diametral.




THEOREM 10. If $P$ is a pseudo-$S_{k,r}$ set of $k + 4$ points, $p_1, p_2, \dots, p_{k+4}$,
no two of which are diametral, then for every pair of distinct points $p_i, p_j$ of $P$,
$\cos(p_i p_j/r) = \pm 1/(k + 1)$. The plus and minus signs are "determinantally
distributed"; i. e., the signs occur in such a manner that the determinant

$$\Delta_{k+4}(p_1, p_2, \dots, p_{k+4}) = |\cos(p_i p_j/r)|, \quad (i, j = 1, 2, \dots, k + 4),$$

may, upon multiplication of appropriate rows and the same numbered columns
by $-1$, be transformed into a determinant with each element outside the
principal diagonal equal to $-1/(k + 1)$.

In a recent paper ¹⁵ it is shown in detail that this theorem follows from
Theorems 6, 7, 8, and 9 of this section, and the argument need not be
repeated here.
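The normal form in Theorem 10 can be examined directly. The sketch below assumes only what the theorem states, namely a determinant with 1 on the principal diagonal and $-1/(k+1)$ elsewhere, and computes its principal minors exactly. The function names are illustrative and are not taken from the paper.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant via Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result  # a row swap flips the sign
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= factor * a[c][j]
    return result


def normal_form(order, k):
    """Matrix with 1 on the diagonal and -1/(k+1) off the diagonal."""
    off = Fraction(-1, k + 1)
    return [[Fraction(1) if i == j else off for j in range(order)]
            for i in range(order)]


def minors_by_order(k):
    """Determinants of the normal form for orders k+1 through k+4.

    Every principal minor of a given order has the same shape, so one
    determinant per order is enough.
    """
    return {m: det(normal_form(m, k)) for m in range(k + 1, k + 5)}


for k in range(4):
    print(k, minors_by_order(k))
```

For each $k$ tried, the minor of order $k+1$ is positive, the minor of order $k+2$ vanishes and the minor of order $k+3$ is negative. This agrees with Theorem 6 (independent $(k+1)$-tuples), with property (a) for the $(k+2)$-tuples, and with Theorem 2.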

FIRST CHARACTERIZATION THEOREM. Let $P$ be a pseudo-$S_{k,r}$ set of more
than $k + 3$ points, containing no diametral points. If $p, q$ are any two distinct
points of $P$, then $\cos(pq/r) = \pm 1/(k + 1)$.

Proof. If $P$ consists of exactly $k + 4$ points, the conclusion is warranted
by Theorem 10. Suppose that $P$ contains at least $k + 5$ points and select
any $k + 4$ points of $P$ containing the points $p$ and $q$. By the Corollary to
Theorem 8, these $k + 4$ points form a pseudo-$S_{k,r}$ set and hence

$$\cos(pq/r) = \pm 1/(k + 1).$$

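LEMMA. If $p_1, p_2, \dots, p_j$ and $q_1, q_2, \dots, q_j$ are two pseudo-$S_{k,r}$ sets
without diametral points, and

$$p_1, p_2, \dots, p_{j-1} \approx q_1, q_2, \dots, q_{j-1},$$

then either

$$p_i p_j = q_i q_j, \quad (i = 1, 2, \dots, j - 1),$$

or

$$\cos(p_i p_j/r) + \cos(q_i q_j/r) = 0, \quad (i = 1, 2, \dots, j - 1).$$

Proof. Since the two sets are pseudo-$S_{k,r}$ sets, then $j \ge k + 3$. The
two $(k + 3)$-tuples $p_{j-k-2}, p_{j-k-1}, \dots, p_j$ and $q_{j-k-2}, q_{j-k-1}, \dots, q_j$ are,
by the Corollary to Theorem 8, pseudo-$S_{k,r}$ sets. It follows from the hypothesis
of the lemma that

$$p_{j-k-2}, p_{j-k-1}, \dots, p_{j-1} \approx q_{j-k-2}, q_{j-k-1}, \dots, q_{j-1},$$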


15 L. M. Blumenthal, "Metric methods in determinant theory," American Journal
of Mathematics, vol. 61 (1939), pp. 912-922. The paper referred to uses, as noted,
Theorems 6, 7, 8, 9 of the present article, but merely states these theorems without
offering any proof.




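and hence (Theorem 7) we have

($A_1$). $p_i p_j = q_i q_j$,
or
($B_1$). $\cos(p_i p_j/r) + \cos(q_i q_j/r) = 0$, $(i = j-k-2, j-k-1, \dots, j-1)$.


Applying the same reasoning to the two $(k + 3)$-tuples

$p_{j-k-3}, p_{j-k-1}, \dots, p_{j-1}, p_j$ and $q_{j-k-3}, q_{j-k-1}, \dots, q_{j-1}, q_j$,

we obtain

($A_2$). $p_j p_{j-k-3} = q_j q_{j-k-3}$, $p_j p_{j-k-1} = q_j q_{j-k-1}$, $\dots$, $p_j p_{j-1} = q_j q_{j-1}$,
or
($B_2$). $\cos(p_j p_{j-k-3}/r) + \cos(q_j q_{j-k-3}/r) = 0$, $\cos(p_j p_{j-k-1}/r) + \cos(q_j q_{j-k-1}/r) = 0$, $\dots$,
$\cos(p_j p_{j-1}/r) + \cos(q_j q_{j-1}/r) = 0$.


It is now easily seen that if the alternative ($A_1$) subsists, then the alternative
($A_2$) holds, while the validity of ($B_1$) implies that of ($B_2$). Thus the
alternatives ($A_1$), ($B_1$) have been extended from $i = j-k-2, j-k-1, \dots, j-1$
to $i = j-k-3, j-k-2, j-k-1, \dots, j-1$. Continuing in
this manner, the index $i$ is made to recede to 1, and the lemma is established.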


SECOND CHARACTERIZATION THEOREM. Let $P$ be a pseudo-$S_{k,r}$ set of
arbitrary power exceeding $k + 3$ and containing no diametral points. Then
for every integer $i > 1$, the determinant $\Delta_i$ formed for each set of $i$ points
(pairwise distinct) of $P$ has, upon multiplication of appropriate rows and the
same numbered columns by $-1$, all elements outside the principal diagonal
equal to $-1/(k + 1)$.


Proof. Let $p_1, p_2, \dots, p_i$ be any set of $i > 1$ pairwise distinct points
of $P$. If $i = k + 4$, the conclusion follows from Theorem 10.


Case 1. If $i < k + 4$ then the $i$ points $p_1, p_2, \dots, p_i$ form part of a set
of $k + 4$ points which, by the Corollary to Theorem 8, is a pseudo-$S_{k,r}$ set.
By Theorem 10, the determinant $\Delta_{k+4}$ of these $k + 4$ points has, upon
multiplication of appropriate rows and the same numbered columns by $-1$, all
elements outside the principal diagonal equal to $-1/(k + 1)$. The
determinant $\Delta_i(p_1, p_2, \dots, p_i)$, being a principal minor of this determinant,
is then transformed by these elementary operations to the form specified in
the theorem.


Case 2. If $i > k + 4$ then by using the preceding lemma in the same
manner that Theorem 7 was applied to the proof of Theorem 10, the method
utilized to prove the latter theorem may be adopted without change to establish
the present theorem in the case under consideration.

It is noted that requiring a pseudo-$S_{k,r}$ set to be free of diametral point-pairs
(a condition that enters into all of the lemmas and theorems following
Theorem 7) rules out pseudo-$S_{0,r}$ sets from consideration since evidently each
pair of points of such a set has distance $\pi r$. It is obvious, however, that
even for these sets the conclusions of the above two characterization theorems
are valid.


4. Spheroidal and pseudo-spherical spaces. Characterization theorems.
A semimetric space of finite diameter $d$ and positive space constant (parameter)
$\rho$ is called an $n$-dimensional spheroidal space provided Properties I-V
of Section 2 are satisfied when the cosine function involved in these statements
is replaced by any monotonic decreasing function $\phi(pq/\rho)$, defined for each
pair of elements $p, q$ of the space, with $\phi(0) = 1$, $\phi(d/\rho) = -1$. The $S_{n,r}$,
as well as the same point set metrized with euclidean (chord) distance, are
examples of spheroidal spaces. It may be observed that spheroidal spaces
arise as simple metric transforms of subsets of the $S_{n,r}$.

Since the derived properties (a)-(l) of Section 2 may be deduced from
Properties I-V (and those properties of $\phi$ given above) they are valid in any
spheroidal space. Thus each $k$-dimensional spheroidal space has minimum congruence
order $k + 3$ with respect to semimetric spaces. The characterization of pseudo
sets of more than $k + 3$ points and free of diametral point-pairs is given by the
two characterization theorems of the preceding section upon replacing the
cosine function by $\phi$. Thus, in particular, pseudo sets for the sphere with
chord metric are characterized by these theorems.
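For the chord metric, one function $\phi$ that meets the stated requirements can be written down explicitly. The sketch below is an illustration under assumptions the paper does not spell out: the space constant is taken as $\rho = r$, so that $d/\rho = 2$, and $\phi(t) = 1 - t^2/2$, which follows from $\cos\theta = 1 - 2\sin^2(\theta/2)$. The function names are hypothetical.

```python
import math


def chord(theta, r):
    """Euclidean (chord) distance between two points of a sphere of radius r
    whose geodesic distance is r * theta."""
    return 2.0 * r * math.sin(theta / 2.0)


def phi_chord(t):
    """Assumed transform for the chord metric, with t = chord / r.

    It decreases monotonically on [0, 2], with phi(0) = 1 and phi(2) = -1.
    """
    return 1.0 - t * t / 2.0


r = 3.0
for theta in (0.0, 0.5, 1.0, 2.0, math.pi):
    # phi applied to the scaled chord distance reproduces cos(geodesic / r).
    print(theta, phi_chord(chord(theta, r) / r), math.cos(theta))
```

With this choice every determinant $|\phi(p_i p_j/\rho)|$ formed from chord distances coincides with the determinant $|\cos(p_i p_j/r)|$ formed from geodesic distances on the same points.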


THE UNIVERSITY OF MISSOURI, 
COLUMBIA, MISSOURI.


A GEOMETRY ASSOCIATED WITH CREMONA’S EQUATIONS.* 


By GERALD B. HUFF.


Introduction. In the geometry of planar Cremona transformations there
are two important problems associated with the forms

$$(xx) = x_1^2 + x_2^2 + \cdots + x_\rho^2 - x_0^2$$
and
$$(lx) = x_1 + x_2 + \cdots + x_\rho - 3x_0.$$

If a complete and regular linear system $\Sigma_{p,d}$ of plane curves of dimension $d$
with the generic curve having genus $p$ is of order $x_0$ and has multiplicities
$x_1, x_2, \dots, x_\rho$ at a set of $\rho$ prescribed base points, then $x = \{x_0; x_1, x_2, \dots, x_\rho\}$
is called the characteristic of $\Sigma_{p,d}$ and satisfies the Cremona equations: ¹

$$(xx) = 1 - d - p, \qquad (lx) = -1 - d + p.$$


In 1934, Coble ² gave a method of determining every ordered solution of these
equations for a given $\rho$, $d$, and $p$. However, a solution of the Cremona
equations may not determine any system $\Sigma_{p,d}$ and there has not yet been discovered
any general criterion for distinguishing between proper, degenerate, and virtual
solutions. (A solution ³ is defined to be proper, degenerate, or virtual
according as the generic curve of the system is (a) existent and irreducible,
(b) existent and reducible, or (c) non-existent.) Coble gave criteria for
$\rho = 9, 10$ and certain $p, d$ but a general criterion is still lacking.

The second important problem in connection with $(lx)$ and $(xx)$ arises
from the fact that a linear transformation,


, 
= — — —* * *—— UppLp, 


which gives the effect on x of a Cremona transformation with F’-points at the 
base points of S»2, must leave (xv) and (lx) absolutely invariant. Once 


* Received September 13, 1939. 

1 A. B. Coble, "Algebraic geometry and theta functions," American Mathematical
Society Colloquium Publications, vol. 10 (1929), New York City.

2 A. B. Coble, "Cremona's diophantine equations," American Journal of
Mathematics, vol. 56 (1934), pp. 459-489.

3 Loc. cit., p. 461.


again, the converse is not true. There are linear transformations of this form,
leaving $(lx)$ and $(xx)$ absolutely invariant, which do not represent any C. T.⁴

For $\rho \le 8$, the number of these linear transformations is finite, and they
all represent Cremona transformations with 8 or fewer F-points. For $\rho = 9$,
the number is infinite but in 1932 Dr. Taylor ⁵ showed that there are a finite
number of types, each expressible in terms of certain parameters, and that
they are all geometric; i. e. they all represent Cremona transformations. These
results were later put in better form by Barber.⁶

For $\rho > 9$, there is still no simple way of distinguishing a geometric linear
transformation. At one time it was thought that it was sufficient that the
numbers $n, r_i, s_j, \alpha_{ij}$ be positive or zero, but examples have been devised which
show that this is not true.⁴

In this paper the problem is studied by considering $(xx) = 0$ and
$(lx) = 0$ as loci in a projective space $S_\rho$. The most interesting result is the
appearance of linear transformations of infinite period and simple algebraic
properties. These transformations give a simple tool for obtaining results
already known and also provide the answers to questions that have been raised
in the literature.

In the work the C-, P-, and D-characteristics (i.e. characteristics of
$(xx), (lx) = -1, -3$; $1, -1$; and $2, 0$ respectively) play their usual
important role. Also elliptic characteristics of $d = 0$, $p = 1$ and $d = 1$, $p = 2$
enter in the work for the first time. The invariant characteristic $\{3; 1^\rho\}$ will
be designated by $l$ and the fundamental P-characteristic $\{0; 0^{\rho-1}, -1\}$ by $\delta$.

The group of linear transformations leaving $(xx)$, $(lx)$ invariant will be
denoted by $G(R)_{\rho,2}$, $G(I)_{\rho,2}$, or $G(C)_{\rho,2}$ according as the elements have
rational coefficients, integer coefficients, or represent Cremona transformations.
The set of $\rho$ points at which $\Sigma$ is defined will be designated by $P_{\rho,2}$.


1. Harmonic Perspectivities. The harmonic perspectivity in any point
$y$ not on $(xx) = 0$ and its polar $S_{\rho-1}$: $(yx) = 0$ has the equations:

(1) $x' = (yy)x - 2(yx)y$.


4 G. B. Huff, "A note on Cremona transformations," Proceedings of the National
Academy of Sciences, vol. 20 (1934), pp. 428-430.

5 M. E. Taylor, "A determination of the types of planar Cremona transformations
with not more than nine F-points," American Journal of Mathematics, vol. 54 (1932),
pp. 123-128.

6 S. F. Barber, "Planar Cremona transformations," American Journal of
Mathematics, vol. 56 (1934), pp. 109-121.




It will, of course, send points on $(xx) = 0$ into points on $(xx) = 0$. If it is
to do the same with points on $(lx) = 0$, then either $y = l$ or $(ly) = 0$.
In the first case the equations of the involution in the point $l$ are:

(2) $x' = (\rho - 9)x - 2(lx)l$.

It is readily shown that if this substitution is to leave $(xx)$, $(lx)$ absolutely
invariant, it must be written in the form

(3) $x' = \dfrac{2(lx)}{\rho - 9}\, l - x$.

For any value of $\rho \ne 9$ this gives an involution which is an invariant member
of $G(R)_{\rho,2}$. Some of these have been noticed in the literature. $\rho = 7, 8, 10, 11$
give elements of $G(I)_{\rho,2}$ which are in $G(C)_{\rho,2}$ for $\rho = 7, 8$. The involutions
for $\rho = 3, 6, 12, 15$ have integer values for $n, r_i, s_j$ and have been studied also.

If $y$ is in $(lx) = 0$, the equations of the involution must be written in
the form

(4) $x' = x - \dfrac{2(yx)}{(yy)}\, y$

to insure invariance of $(xx)$, $(lx)$. For $(yy) = 2$, these are all elements of
$G(I)_{\rho,2}$ and are the involutions in D-conditions which have been studied
intensely. If $y$ is a geometric D-condition, the involution is in $G(C)_{\rho,2}$.


For $(yy) = -2$, we have members of $G(I)_{\rho,2}$ which are never in $G(C)_{\rho,2}$
since $n = 1 - y_0^2$ is never a positive integer for $y_0 \ne 0$. The first integer $y$
such that $(yy) = -2$ occurs for $\rho = 11$ and gives an interesting element of
$G(I)_{11,2}$.
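The two involutions of this section can be tried on small integer characteristics. The sketch below assumes the sign conventions $(xy) = x_1y_1 + \cdots + x_\rho y_\rho - x_0y_0$ and $l = \{3; 1^\rho\}$, and the normalized forms $x' = \frac{2(lx)}{\rho - 9}\,l - x$ and $x' = x - \frac{2(yx)}{(yy)}\,y$. The function names are illustrative only.

```python
from fractions import Fraction


def form(x, y):
    """Bilinear form (xy) = x1*y1 + ... + x_rho*y_rho - x0*y0 (assumed convention)."""
    return sum(a * b for a, b in zip(x[1:], y[1:])) - x[0] * y[0]


def l_vector(rho):
    """The invariant characteristic l = {3; 1^rho}."""
    return [3] + [1] * rho


def involution_in_l(x):
    """x' = 2(lx)/(rho - 9) * l - x; undefined for rho = 9, where (ll) = 0."""
    rho = len(x) - 1
    l = l_vector(rho)
    c = Fraction(2 * form(l, x), rho - 9)
    return [c * li - xi for li, xi in zip(l, x)]


def involution_in_y(y, x):
    """x' = x - 2(yx)/(yy) * y, intended for y with (ly) = 0 and (yy) != 0."""
    c = Fraction(2 * form(y, x), form(y, y))
    return [xi - c * yi for xi, yi in zip(x, y)]


# rho = 8: image of the characteristic {1; 0^8} (a general line).
line = [1] + [0] * 8
image = involution_in_l(line)
print(image)
print(form(image, image) == form(line, line))
print(form(l_vector(8), image) == form(l_vector(8), line))

# A D-condition y = {0; 1, -1, 0, 0} with (yy) = 2 interchanges x1 and x2.
print(involution_in_y([0, 1, -1, 0, 0], [5, 2, 1, 0, 0]))
```

For $\rho = 8$ the line is carried to order 17 with eight 6-fold points, and both forms keep their values. Applying either involution twice returns the original characteristic.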


2. Pencils of Characteristics on a line. Related characteristics. We
will say that two characteristics $x, y$ are of the same sort if $(lx) = (ly)$ and
$(xx) = (yy)$. This means that the associated linear systems, if any, will have
the same genus $p$ and dimension $d$. We may ask ourselves: under what
conditions is

(5) $z = \lambda x + \mu y$

of the same sort as both $x$ and $y$? Substitution yields:

(6) $z = \lambda x + \mu y$ is of the same sort as both $x, y$ if and only if $(xy) = (xx)
= (yy)$ and $\lambda + \mu = 1$.
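The substitution behind (6) can be written out in one line. This is a sketch that uses only the bilinearity of the two forms and the hypothesis that $x$ and $y$ are of the same sort; it covers the sufficiency direction.

```latex
\begin{align*}
(lz) &= \lambda (lx) + \mu (ly) = (\lambda + \mu)(lx), \\
(zz) &= \lambda^{2}(xx) + 2\lambda\mu\,(xy) + \mu^{2}(yy).
\end{align*}
% With (xy) = (xx) = (yy) and \lambda + \mu = 1:
\begin{align*}
(zz) &= (\lambda + \mu)^{2}(xx) = (xx), &
(lz) &= (lx).
\end{align*}
```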

If two characteristics of the same sort satisfy the condition $(xx) = (yy)
= (xy)$ we will say that they are related.




(7) If $x$ and $y$ are two related characteristics, all characteristics in the form

$$\lambda x + (1 - \lambda)y = \lambda(x - y) + y$$
or
$$(1 - \mu)x + \mu y = \mu(y - x) + x$$

are of the same sort as $x, y$.


Linear pencils of this type have been studied in detail in the literature.
All D-conditions on $P_{9,2}$ are included in 120 such pencils. No related pairs
of geometric P-curves or C-nets exist. Indeed, the condition that two be
related is invariant under $G(C)_{\rho,2}$ and it is evident that $\{0; 0^{\rho-1}, -1\}$ and $\{1; 0^\rho\}$
are not related to any geometric characteristics of the same sorts.
Non-geometric pencils of related P-curves have been exhibited.⁷

If $x$ and $y$ of the same sort are related, the line joining $x, y$ is tangent
to $(xx) = 0$ at a point in $(lx) = 0$. Hence the result given in (7) may be
put in the slightly different form:

(8) Characteristics of the form $ka + y$ are of the same sort as $y$ for all
values of $k$ if and only if $(aa) = (la) = (ay) = 0$.

The pencils determined in (8) are the same as those given in (7). The
difference is that in (7) we think of the pencil as determined by two
characteristics of the same sort; and in (8) the pencil is determined by an elliptic
characteristic $a$ and a point $y$ in the tangent $S_{\rho-1}$ of $(xx) = 0$ at $a$.

3. Systems of characteristics which lie on a plane conic. The pencils
of characteristics obtained in 2 contained characteristics lying on a line
determined by two points. We can obtain systems lying on plane conics by
seeking the conditions on $a$ and $b$ that

$$k^2 a + kb + x$$

shall be of the same sort as $x$ for all values of $k$. The equations

$$(l,\ k^2 a + kb + x) = (lx),$$
$$(k^2 a + kb + x,\ k^2 a + kb + x) = (xx),$$

regarded as identities in $k$ yield:

($\alpha$) $(la) = (lb) = 0$, $(aa) = (ab) = 0$,
and
($\beta$) $2(ax) + (bb) = 0$, $(bx) = 0$.


If $a$ and $b$ are any two characteristics which satisfy ($\alpha$), then


7See reference p. 480. 




$$\bar{a} = \lambda a, \qquad \bar{b} = \mu a + \nu b$$

will also satisfy ($\alpha$). Substituting these in ($\beta$) gives

$$2\lambda(ax) + \nu^2(bb) = 0, \qquad \mu(ax) + \nu(bx) = 0.$$

From the second we must have

$$\mu = -\sigma(bx), \qquad \nu = \sigma(ax).$$

Substituting in the first gives

$$2\lambda(ax) + \sigma^2(ax)^2(bb) = 0.$$

If $(ax) = 0$, this is identically satisfied, $\nu = 0$, and we have a pencil of the
type in (8). For $(ax) \ne 0$, we must have $\lambda = -\tfrac{1}{2}(bb)\sigma^2(ax)$. (It is
readily verified that if $b$ is an integer characteristic then $\tfrac{1}{2}(bb)$ is an integer.)
Thus,

(9) If $a$ and $b$ satisfy $(aa) = (la) = (lb) = (ab) = 0$ then all characteristics

$$x' = -\tfrac{1}{2}(bb)(\sigma k)^2(ax)\,a + \sigma k[(ax)b - (bx)a] + x$$

are of the same sort as $x$. Moreover, every system of characteristics of the
same sort and of the form

$$k^2\bar{a} + k\bar{b} + x$$

can be obtained in this way. If $(ax) = 0$, the system lies on the line $ka + x$.


This means that any elliptic point $a$ and a point $b$ on its tangent plane
and $(lx) = 0$ determine for any characteristic $x$ a system of characteristics
of the same sort as $x$. If $(ax) \ne 0$, this system lies on a conic in the plane
determined by $a$, $b$, and $x$. In the next paragraph we will find interesting
properties of these systems.


4. The linear substitution $S^k_{a,b}$. In the preceding section it was found
that any elliptic characteristic a and a characteristic b satisfying (ab) = (lb)
= 0 determine with any characteristic x a system of characteristics of the
same sort as x and lying in the plane determined by a, b, x. This led to
the equation:

(10) \[ x' = -\tfrac12 (bb)\,k^2 (ax)\,a + k\,[(ax)\,b - (bx)\,a] + x. \]

For a, b and k given, this is a linear substitution which sends any
characteristic x into a characteristic x' of the same sort. It leaves (xx), (lx)
invariant. For rational a, b and k it is always an element of $G(R)_{p,2}$; for
integer a, b, and k it is in $G(I)_{p,2}$ and is sometimes an element of $G(C)_{p,2}$.
We will designate such a substitution by $S^k_{a,b}$ and study its properties.
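The invariance claims for the substitution (10) can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the bilinear form to be (xy) = x_1 y_1 + ... + x_p y_p - x_0 y_0 and l = {3; 1^p} (conventions inferred from the identities quoted in this paper, not restated here), and the helper names `form` and `S` as well as the sample vectors are hypothetical.

```python
from fractions import Fraction

def form(x, y):
    # Assumed bilinear form: (xy) = x_1*y_1 + ... + x_p*y_p - x_0*y_0.
    return sum(s * t for s, t in zip(x[1:], y[1:])) - x[0] * y[0]

def S(a, b, k, x):
    # Equation (10): x' = -1/2 (bb) k^2 (ax) a + k[(ax) b - (bx) a] + x.
    k = Fraction(k)
    ax, bx, bb = form(a, x), form(b, x), form(b, b)
    return [xi + k * (ax * bi - bx * ai) - Fraction(1, 2) * bb * k * k * ax * ai
            for xi, ai, bi in zip(x, a, b)]

p = 9
l = [3] + [1] * p                    # l = {3; 1^9}
a = l                                # elliptic for p = 9: (aa) = (la) = 0
b = [0, 1, -1] + [0] * (p - 2)       # satisfies (ab) = (lb) = 0
x = [2, 1, 1, 1, 0, 0, 1, 0, 0, 0]   # arbitrary sample characteristic

y = S(a, b, 3, x)
print(form(y, y) == form(x, x), form(l, y) == form(l, x))  # (xx), (lx) preserved
print(S(a, b, 2, y) == S(a, b, 5, x))                      # exponents add
```

Exact rational arithmetic is used so that the checks are identities rather than floating-point approximations.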
The following properties are readily verified: 


a,b a,b a,b 7 a,b; 


The parameter k plays the role of an exponent and gives a simple law of
multiplication for substitutions $S^k_{a,b}$ defined for the same elliptic characteristic a.
The simple algebraic form of (10) leads to the following theorems: 


(12) All lines through a which are tangent to (xx) = 0 are left invariant
by $S^k_{a,b}$ for all b, k.

(13) $S_{a,b} = S_{\bar a,\bar b}$ if and only if $\bar a = \lambda a$ and $\lambda \bar b = \mu a + b$. That is, two
substitutions are the same only if they are defined at the same point a and b, $\bar b$
lie on a line through a.

(14) If x, y are two characteristics of the same sort, and an elliptic
characteristic a exists such that

\[ (ax) = (ay) = k \neq 0, \]

then b = (y − x)/k satisfies (ab) = 0 and defines an $S_{a,b}$ which sends x
into y.
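Theorem (14) lends itself to a direct check. The following is a minimal sketch, again assuming the form (xy) = x_1 y_1 + ... + x_p y_p - x_0 y_0 and l = {3; 1^9}; the two sample characteristics and the helper names are hypothetical choices made for illustration.

```python
from fractions import Fraction

def form(x, y):
    # Assumed bilinear form: (xy) = x_1*y_1 + ... + x_p*y_p - x_0*y_0.
    return sum(s * t for s, t in zip(x[1:], y[1:])) - x[0] * y[0]

def S(a, b, x):
    # Equation (10) with k = 1.
    ax, bx, bb = form(a, x), form(b, x), form(b, b)
    return [xi + ax * bi - bx * ai - Fraction(1, 2) * bb * ax * ai
            for xi, ai, bi in zip(x, a, b)]

l = [3] + [1] * 9          # elliptic characteristic used as a
x = [0] * 9 + [-1]         # {0; 0^8, -1}
y = [0, -1] + [0] * 8      # {0; -1, 0^8}, same (xx) and (lx) as x

k = form(l, x)             # common value (ax) = (ay)
b = [Fraction(yi - xi, k) for xi, yi in zip(x, y)]   # b = (y - x)/k

print(form(l, b) == 0)     # (ab) = 0 with a = l
print(S(l, b, x) == y)     # the substitution carries x into y
```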

A given elliptic characteristic a and a suitable b determine a linear
substitution $S_{a,b}$ and its inverse $S^{-1}_{a,b}$ which generate an infinite “cyclic” group.
On the other hand, a particular elliptic characteristic a defines an aggregate
of elements $S_{a,b}$ for all characteristics b satisfying

\[ (ab) = (lb) = 0. \]

From the laws (11) it is evident that this aggregate constitutes an infinite
abelian group. We will designate it by $\{S_a\}$.


5. The condition that $I_c I_d$ shall be a substitution $S_{a,b}$. To relate
transformations $S_{a,b}$ to known elements of $G(C)_{p,2}$ we investigate the conditions
under which $I_c I_d$, the product of two harmonic perspectivities in D-conditions,
may be such a transformation. As an algebraic tool we use the theorem:⁸

The necessary and sufficient condition that a square matrix M can have
its k-th power a matrix whose elements are polynomials of degree n in k is
that $(M - I)^{n+1} = 0$.

⁸ Given in a paper by the author read to the Texas Section of the Association,
May, 1936.


Obviously, the k-th power of the matrix of any transformation $S_{a,b}$ has
elements which are quadratics in k. Then if $I_c I_d$ is to be such a
transformation, its matrix must satisfy $(M - I)^3 = 0$. Applying this condition, we find
that we must have $(cd)^2 = 4$, which means that the line joining c and d is
tangent to (xx) = 0 at the elliptic point c − d. That the condition is
sufficient is shown by verifying that $S_{c-d,c} = I_c I_d$ if (cd) = 2.⁹
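The two facts used here can be verified on a small example. The sketch below makes several assumptions that are not taken from the paper: the form is (xy) = x_1 y_1 + ... + x_p y_p - x_0 y_0, a harmonic perspectivity in a D-condition d with (dd) = 2 acts as x' = x - (dx)d, and the product I_c I_d is read as "apply I_c first, then I_d". The vectors c, d are hypothetical and chosen only so that (cc) = (dd) = (cd) = 2.

```python
def form(x, y):
    # Assumed bilinear form: (xy) = x_1*y_1 + ... + x_p*y_p - x_0*y_0.
    return sum(s * t for s, t in zip(x[1:], y[1:])) - x[0] * y[0]

def reflect(d, x):
    # Assumed form of the harmonic perspectivity I_d when (dd) = 2.
    dx = form(d, x)
    return [xi - dx * di for xi, di in zip(x, d)]

def S(a, b, x):
    # Equation (10) with k = 1; (bb) is even here, so // 2 is exact.
    ax, bx, bb = form(a, x), form(b, x), form(b, b)
    return [xi + ax * bi - bx * ai - (bb // 2) * ax * ai
            for xi, ai, bi in zip(x, a, b)]

def matrix(f, n):
    # Matrix of a linear map f, built from the images of the unit vectors.
    cols = [f([int(i == j) for i in range(n)]) for j in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(r * c for r, c in zip(row, col)) for col in zip(*B)] for row in A]

p = 9
l = [3] + [1] * p
d = [0, 1, -1] + [0] * (p - 2)
c = [di + li for di, li in zip(d, l)]      # c = d + l gives (cd) = 2
a = [ci - di for ci, di in zip(c, d)]      # elliptic point c - d

M = matrix(lambda x: reflect(d, reflect(c, x)), p + 1)
N = [[M[i][j] - int(i == j) for j in range(p + 1)] for i in range(p + 1)]
N2 = matmul(N, N)
N3 = matmul(N2, N)

print(M == matrix(lambda x: S(a, c, x), p + 1))   # product equals S_{c-d,c}
print(any(v for row in N2 for v in row))          # (M - I)^2 is not zero
print(not any(v for row in N3 for v in row))      # (M - I)^3 = 0
```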


(15) If c, d are two distinct D-conditions, then $I_c I_d$ is a linear substitution
$S_{a,b}$ if and only if $(cd)^2 = 4$. If signs are chosen so that c, d are related
(i.e. (cd) = 2), then

\[ I_c I_d = S_{c-d,c} \]

and $I_c I_d$ and $I_d I_c$ generate an infinite cyclic group.


In 1933 Dr. Barber,® using purely algebraic methods, obtained a set of
necessary and sufficient conditions that $I_c I_d$ and $I_{\bar c} I_{\bar d}$ be permutable. From
the present geometrical point of view it is clear that permutability is possible
if either:

(a) the line $\bar c\bar d$ is in the polar $S_{p-2}$ of the line cd with respect to (xx) = 0;
or
(b) the lines cd and $\bar c\bar d$ are tangent to (xx) = 0 at the same point.

Indeed, in the second case $I_c I_d$ and $I_{\bar c} I_{\bar d}$ are transformations $S_{a,b}$ defined
at the same elliptic point and permutable by (11). The algebraic conditions
for these two cases are equivalent to Barber's conditions and hence are
necessary as well as sufficient. Thus


(16) The necessary and sufficient condition that $I_c I_d$ and $I_{\bar c} I_{\bar d}$ be permutable
is that either:

the lines cd and $\bar c\bar d$ be conjugate with respect to (xx) = 0,
or the lines cd and $\bar c\bar d$ be tangent to (xx) = 0 at the same point.

A simple algebraic form is:

(17) The necessary and sufficient condition that $I_c I_d$ and $I_{\bar c} I_{\bar d}$ be permutable
is that either

\[ (c\bar c) = (c\bar d) = (d\bar c) = (d\bar d) = 0 \]
or
\[ (cd)^2 = (\bar c\bar d)^2 = 4 \quad\text{and}\quad c - d = k(\bar c - \bar d), \]

where signs are chosen so that $(cd) = (\bar c\bar d) = 2$.

⁹ (cd) = 2 is equivalent to $(cd)^2 = 4$ since −c determines the same harmonic
perspectivity and the same D-condition as c.


These conditions are simpler than those given by Barber. However, all 
conditions given there are necessary in a technical sense, because they are all 


consequences of these. 


6. Results in the case p = 9. For p = 9 the $S_8$, (lx) = 0, is tangent
to (xx) = 0 at the elliptic characteristic l. Any D-condition is on (lx) = 0
and by (8) defines a pencil of characteristics kl + d. These lines all meet
$x_9 = 0$ in points which are D-conditions on $P_8^2$. Thus the lines of D-conditions
determined by the D-conditions for which $x_9 = 0$ contain all D-conditions.
There are 120 of these.

(18) All D-conditions for $P_9^2$ lie on the 120 tangent lines

kl + d

where d is any one of the 120 D-conditions for which $x_9 = 0$.


The one elliptic characteristic l defines a group $\{S_l\}$. Any element is
determined by a characteristic b satisfying (lb) = 0. If one choice b defines
an element, then by (13) all $\bar b = kl + b$ define the same element. In
particular, for $k = -b_9$, there is a $\bar b$ in $x_9 = 0$ which defines the element and
only one of this sort. Every element of $\{S_l\}$ is determined once and only once
by all characteristics b satisfying $(lb) = b_9 = 0$. Indeed, it is the infinite
abelian sub-group $a_9$.

(19) $\{S_l\}$ is the infinite abelian sub-group $a_9$ and each element is given
once and only once by $S_{l,b}$, where $(lb) = b_9 = 0$.


Any two P-characteristics y, z satisfy (ly) = (lz) = −1, and hence by
(14) (z − y)/(−1) = y − z is a b such that $S_{l,b}$ sends y into z.

(20) All P-characteristics on $P_9^2$ are geometric and any pair defines a unique
element of $a_9$ which sends one into the other. The images of $\{0; 0^8, -1\}$
under $a_9$ include all P-characteristics once and only once.


$a_9$ is the integer group defined at l. If we allow rational values of b
then we have a subgroup of $G(R)_{9,2}$. Under this group all C-characteristics are
conjugate. Indeed, two C-characteristics y, z satisfy (ly) = (lz) = −3 and
by (14) (y − z)/3 is a rational b such that $S_{l,b}$ sends y into z. Ordinarily
this would give an element with rational coefficients. It may happen, however,
that they turn out to be integers.

To state the conditions under which this occurs we note that $\bar b = \tfrac13 (kl + y - z)$
yields the same substitution as (y − z)/3 and is integral if
$y - z \equiv kl$, mod 3. On the other hand, if z is the image of y under an element
of $a_9$, it can be shown (by writing down the explicit form of a general element
of $a_9$) that $y - z \equiv kl$, mod 3.

(21) The necessary and sufficient condition that two C-characteristics y, z
shall be conjugate under $a_9$ is that

\[ y - z \equiv kl \pmod 3. \]


Now for every integer C-characteristic on nine points there is one on 
eight points which satisfies the condition of (21). A C-characteristic on nine 
points is geometric or not according as it is related under (21) to a geometric 
or non-geometric characteristic on $P_8^2$. This can be used to obtain the 
criterion obtained by Coble.” 

The substitutions with rational coefficients sometimes have a surprising
form. For $b = \{8; 3^8 0\}$, $S^k_{l,b}$ has the form:

\[
\begin{aligned}
x_0' &= (36k^2+1)x_0 - (12k^2+k)x_1 - (12k^2+k)x_2 - \cdots - (12k^2-8k)x_9,\\
x_1' &= (12k^2-k)x_0 - (4k^2-1)x_1 - 4k^2x_2 - \cdots - (4k^2-3k)x_9,\\
x_2' &= (12k^2-k)x_0 - 4k^2x_1 - (4k^2-1)x_2 - \cdots - (4k^2-3k)x_9,\\
&\;\;\vdots\\
x_8' &= (12k^2-k)x_0 - 4k^2x_1 - \cdots - (4k^2-1)x_8 - (4k^2-3k)x_9,\\
x_9' &= (12k^2+8k)x_0 - (4k^2+3k)x_1 - (4k^2+3k)x_2 - \cdots - (4k^2-1)x_9.
\end{aligned}
\]

For k = −1/3 this gives an element of $G(R)_{9,2}$ whose cube is in $G(C)_{9,2}$.


The characteristic $\{n; r_i\} = \{5; 1^8 4\}$ is integral and indeed is geometric.
But $\{n; s_j\} = \{5; (5/3)^8, -4/3\}$ is a rational C-characteristic. This answers the
conjecture¹⁰ that integral $\{n; r_i\}$ must require integral $\{n; s_j\}$.
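The coefficients for k = −1/3 can be recomputed from equation (10). This is a hedged sketch: it assumes the form (xy) = x_1 y_1 + ... + x_9 y_9 - x_0 y_0, l = {3; 1^9}, and the reading that {n; r_i} comes from the first row of the matrix (x_0' = n x_0 − Σ r_i x_i) and {n; s_j} from its first column. The helper names are hypothetical.

```python
from fractions import Fraction

def form(x, y):
    # Assumed bilinear form: (xy) = x_1*y_1 + ... + x_9*y_9 - x_0*y_0.
    return sum(s * t for s, t in zip(x[1:], y[1:])) - x[0] * y[0]

def S(a, b, k, x):
    # Equation (10): x' = -1/2 (bb) k^2 (ax) a + k[(ax) b - (bx) a] + x.
    k = Fraction(k)
    ax, bx, bb = form(a, x), form(b, x), form(b, b)
    return [xi + k * (ax * bi - bx * ai) - Fraction(1, 2) * bb * k * k * ax * ai
            for xi, ai, bi in zip(x, a, b)]

def matrix(a, b, k):
    # Row i holds the coefficients of x_i' in terms of x_0, ..., x_9.
    n = len(a)
    cols = [S(a, b, k, [int(i == j) for i in range(n)]) for j in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(r * c for r, c in zip(row, col)) for col in zip(*B)] for row in A]

l = [3] + [1] * 9
b = [8] + [3] * 8 + [0]
M = matrix(l, b, Fraction(-1, 3))

print(M[0][0], [-v for v in M[0][1:]])   # n and the r_i from the first row
print([row[0] for row in M])             # n and the s_j from the first column
cube = matmul(M, matmul(M, M))
print(cube == matrix(l, b, -1))          # the cube is the substitution with k = -1
print(all(v.denominator == 1 for row in cube for v in row))
```

The first row is integral while the first column is not, which is the phenomenon the paragraph above describes.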


7. Results for the case p = 10. For $P_{10}^2$ the characteristic l again
plays a particular role. In this case −l satisfies the equations

\[ (l, -l) = -1, \qquad (-l, -l) = 1, \]

and defines a virtual P-characteristic. Moreover, any P-characteristic p
satisfies (lp) = −1 or (−l p) = 1 and is related to −l. Hence for p = 10
all P-characteristics lie on the tangent cone from −l to (xx) = 0. They
may be obtained by joining all elliptic points a to −l.


79 See reference ”, p. 489. 




(22) All integer P-characteristics on $P_{10}^2$ are given by

\[ p = ka - l \]

where k is any integer and a is any elliptic characteristic.


Now Coble showed¹¹ that any one of these could be reduced under $G(C)_{10,2}$
to an irreducible P-characteristic of the form $k(l + \delta) - l$, where $\delta$ is the
fundamental P-characteristic $\{0; 0^9, -1\}$ and $(l + \delta) = \{3; 1^9 0\}$ is the
earliest geometric elliptic characteristic. It follows then that all elliptic
characteristics are conjugate under $G(C)_{10,2}$ to $l + \delta$ or a multiple of $l + \delta$. In
particular,

(23) All elliptic characteristics of positive order and G.C.D. = 1 are
geometric and reducible under $G(C)_{10,2}$ to $l + \delta = \{3; 1^9 0\}$.


Combining this with (22) yields the result 


(24) All geometric P-characteristics are given by 
p=a—l, 


where a runs over all geometric elliptic characteristics. 


That is, on each line of the cone of all P-characteristics there lies one 
and only one geometric characteristic. That (24) gives a definite determina- 
tion of all geometric P-characteristics is clear when we recall that all elliptic 
characteristics are easily obtained by Coble’s method. 

At the particular elliptic characteristic $a = l + \delta$ there is defined a group
$\{S_a\}$ which is simply isomorphic to $a_9$. At any other elliptic characteristic $\bar a$
there is defined a group $\{S_{\bar a}\}$, which is simply a transform of $\{S_a\}$ under
$G(C)_{10,2}$, for a and $\bar a$ are conjugate under this group.

(25) Every infinite abelian sub-group of type $\{S_a\}$ generated by pairs of
involutions in related D-conditions is geometric and indeed is simply
isomorphic to $a_9$. Such a sub-group is defined for each elliptic characteristic and
all such sub-groups are obtainable in that way.


Dr. Barber obtained by experiment one of these defined at $\{4; 2^2 1^8\}$. 
(25) gives a complete classification of sub-groups of this sort. 

l plays a particular role in another sense. By (3), $I_l$, the harmonic
perspectivity defined by l and (lx) = 0, is a member of $G(I)_{10,2}$ and has the
equations

\[ I_l:\quad x' = x - 2l(lx). \]




Coble discovered this involution in another way and named it $T_{10}$. For $T_{10}$,
$\{n; r_i\} = \{n; s_j\} = \{-19; -6^{10}\}$ and the P-characteristics are of the form
$-2(l + \delta) + \delta$. That is, all its P-characteristics are irreducible.¹² This
naturally raises the question: is $T_{10}$ the only element of $G(I)_{10,2}$ with this
property? Coble's simple algebraic form for irreducible P-characteristics
provides an easy answer, since two P-characteristics must satisfy (pp') = 0. If

\[ \{3k;\ k^8, -1, k\} \quad\text{and}\quad \{3k';\ k'^8, k', -1\} \]

are to satisfy this, then kk' + k + k' = 0 or (k + 1)(k' + 1) = 1, which is
true for integers only when k = k' = 0, −2. Hence,

(26) Any element of $G(I)_{10,2}$ such that all its P-characteristics are
irreducible under $G(C)_{10,2}$ is either $T_{10}$ or $T_{10}$ multiplied by an element of $\pi$.


An important corollary is: 
(27) G(Z)10,2 is generated by w, and Ty. 

Another result is interesting to the writer in that it enables him to answer
a question that has been in his mind for some time. It is known that
$n, s_i, r_j, \alpha_{ij} \ge 0$ is not a sufficient condition that an element of $G(I)_{p,2}$ be in
$G(C)_{p,2}$, the first example being devised for p = 11. However, the condition
is sufficient for p = 9. The only case in doubt has been p = 10. From (26)
we see that any element in $G(I)_{10,2}$ but not in $G(C)_{10,2}$ can be written in the
form $ET_{10}$, where E is an element of $G(C)_{10,2}$. By the algebraic form of
$T_{10}$ it can be shown that the n of such an element must be negative.

(28) For p = 10 the elements of $G(I)_{p,2}$ which have $n, r_i, s_j, \alpha_{ij} \ge 0$ are
all in $G(C)_{10,2}$. For p = 11, this is no longer true.

8. The case p = 11. For p = 11, l defines an element of $G(I)_{11,2}$,

\[ T_{11}:\quad x' = x - l(lx), \]

for which $\{n; s_j\} = \{-10; -3^{11}\}$. It sends any C or P-characteristic of
positive order into one of negative order. Also, this is the first place a
transformation $I_y$ is defined for (ly) = 0, (yy) = −2. The first case occurs
for $v = \{4; 2\, 1^{10}\}$ and has the equations

\[ I_v:\quad x' = x + v(vx). \]

Since (lv) = 0, $I_v$ and $T_{11}$ are permutable and $I_v T_{11}$ is an involution. Indeed,
it is the de Jonquières involution defined by the geometric net $\{6; 5\, 1^{10}\}$. This



example is interesting since it shows two elements of $G(I)_{11,2}$ which are virtual
and such that their product is geometric.

Another new type of involution is $I_d$, the involution in the first
non-geometric D-condition $\{3; 1^{10}, -1\}$. The product of $I_d$ and $T_{11}$ is abelian
and $I_d T_{11} = T_{10}$. This means that the group generated by all
transformations $I_y$, (yy) = ±2, must include $T_{10}$ and $T_{11}$. Or, $T_{10}$ and $T_{11}$ seem to
give the new types of elements of $G(I)_{11,2}$; so that weight is given to the
conjecture that $G(I)_{11,2}$ is generated by $A_{123}$, $\pi$, $T_{10}$ and $T_{11}$.

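Further, on $P_{11}^2$ the group $\{S_a\}$ defined at $a = \{3; 1^9 0^2\}$ has as
conditions on b:

\[ b_{10} + b_{11} = 0, \]
\[ b_1 + b_2 + \cdots + b_9 + b_{10} + b_{11} = 3b_0, \quad\text{or}\quad b_1 + b_2 + \cdots + b_9 = 3b_0. \]

An element is determined then by $b = \{0; 0^9, 1, -1\}$ and its k-th power has
the form:

\[
\begin{aligned}
x_0' &= (9k^2+1)x_0 - 3k^2x_1 - \cdots - 3k^2x_9 - 3kx_{10} + 3kx_{11},\\
x_i' &= 3k^2x_0 - k^2x_1 - \cdots - (k^2-1)x_i - \cdots - k^2x_9 - kx_{10} + kx_{11} \quad (i = 1, \ldots, 9),\\
x_{10}' &= -3kx_0 + kx_1 + kx_2 + \cdots + kx_9 + x_{10},\\
x_{11}' &= 3kx_0 - kx_1 - kx_2 - \cdots - kx_9 + x_{11}.
\end{aligned}
\]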


This infinite cyclic group furnishes whole families of irreducible P- and
C-characteristics, including all irreducible P-characteristics for $P_{10}^2$. That is,
these characteristics which are “irreducible” under $G(C)_{11,2}$ are no longer
irreducible under $G(I)_{11,2}$. $S_{a,b}$ is the product $I_c I_d$ where c is the geometric
D-condition $\{0; 0^9, 1, -1\}$ and d is the irreducible non-geometric D-condition
$\{3; 1^{10}, -1\}$.

At the elliptic characteristic $\{4; 2^2 1^8 0\}$ a satisfactory b is $\{1; 1^2 0^8 1\}$ which
defines a cyclic group of elements in $G(I)_{11,2}$. None of these are geometric
nor are any of the P or C-characteristics geometric, for this group is the
transform of the one above under an element of $G(C)_{11,2}$.


Conclusion. The interesting point in this work is its unity and the
directness permitted by the geometrical point of view. The invariant
involutions (3) which are defined for $G(I)_{p,2}$, p = 7, 8, 10, 11 and the
transformations (4) for (yy) = −2 (which are always virtual) appear in the
same way as the thoroughly studied involutions in D-conditions. Attention
is drawn to pencils of elliptic curves of genus 2, which define these virtual
involutions. Could it be that these virtual involutions may have some
geometrical meaning?

The study of the systems of characteristics lying on a line and on a plane
conic is important in that it leads to the linear substitution $S_{a,b}$. The writer
feels that the exceedingly simple laws which these satisfy should throw
considerable light on the structure of the groups for p = 11. In particular,
Theorem (14) furnishes a simple sufficient condition that two characteristics
be conjugate under $G(I)_{p,2}$ and for p = 9 gives very simple results. The
unity of the work would be increased if a simple geometrical definition of
these transformations could be given.

All geometrical P-characteristics for $P_9^2$ occur once and only once among
the P-curves of the sub-group $\{S_l\}$ for p = 9. From examples studied for
larger p it seems possible that the aggregate of all geometric subgroups $\{S_a\}$,
defined at all geometric elliptic characteristics a, might have this property.
An affirmative answer to this conjecture would make the study of $G(C)_{p,2}$
dependent only on the nature of the elliptic characteristics defined for that
value of p. Thus elliptic characteristics may play as important a role in the
general theory as C, P, and D-characteristics.


SOUTHERN METHODIST UNIVERSITY, 
DALLAS, TEXAS. 




POLYNOMIALS WHOSE REAL PART IS BOUNDED ON A GIVEN 
CURVE IN THE COMPLEX PLANE.* 


By A. C. SCHAEFFER and G. SZEGÖ.


Introduction. 1. In what follows we denote a rational polynomial of
the complex variable z by $\pi_n$, if the degree of this polynomial is n.

As a simple consequence of the theorem of S. Bernstein on trigonometric
polynomials, the following holds:¹

A. Let f(z) be a $\pi_n$, and $|f(z)| \le 1$ in $|z| \le 1$. Then $|f'(z)| \le n$
in $|z| \le 1$, with the equality only if $f(z) = \epsilon z^n$, $|\epsilon| = 1$.

This theorem has been generalized by Szegö in two different directions.
First, the unit circle may be replaced by a Jordan curve subject to certain
restrictions:²

B. Let Γ be an open or a closed Jordan curve consisting of a finite
number of analytic arcs which join so that the exterior angle³ is greater than
zero. If f(z) is a $\pi_n$ satisfying $|f(z)| \le 1$ on Γ, then at any point $z_0$ of Γ

\[ |f'(z_0)| \le A n^{\alpha}. \]

Here A is a constant which depends only on Γ and $z_0$, and $\alpha\pi$ is the exterior
angle of Γ at $z_0$. The order of this bound as n becomes infinite can not be
improved.

On the other hand, the condition $|f(z)| \le 1$ in Theorem A can be
replaced by $|\Re f(z)| \le 1$, so the following is true:⁴

C. Let f(z) be a $\pi_n$, and $|\Re f(z)| \le 1$ in $|z| \le 1$. Then $|f'(z)| \le n$
in $|z| \le 1$, with the equality only if $f(z) = \epsilon z^n$, $|\epsilon| = 1$.


* Received April 22, 1940. 

¹ M. Riesz, “Eine trigonometrische Interpolationsformel und einige Ungleichungen
für Polynome,” Jahresbericht der Deutschen Mathematiker-Vereinigung, vol. 23 (1914),
pp. 354-368. See also, O. Szász, “Korlátos hatványsorokról,” Mathematikai és
Természettudományi Értesítő, vol. 43 (1926), pp. 504-520.

² G. Szegö, “Über einen Satz von A. Markoff,” Mathematische Zeitschrift, vol. 23
(1925), pp. 45-61.

³ In case Γ is a closed Jordan curve, the exterior angle at any point of Γ is defined
as usual. If Γ is an open curve, the exterior angle is defined as in loc. cit.², pp. 48-49.

⁴ G. Szegö, “Über einen Satz des Herrn Serge Bernstein,” Königsberger Gelehrte
Gesellschaft, Naturwissenschaftliche Klasse, 1928, pp. 59-70. Also, S. Bernstein, “Sur
un théorème de M. Szegö,” Prace Matematyczno-Fizyczne, vol. 44 (1937), pp. 9-14.


2. The main result of the present note is: 


THEOREM 1. Let Γ be a closed Jordan curve consisting of a finite number
of analytic arcs which join in such a way that the exterior angle is always
greater than zero and less than $2\pi$. Let f(z) be a $\pi_n$ satisfying

(1) \[ |\Re f(z)| \le 1, \qquad z \in \Gamma. \]

Then at an arbitrary point $z_0$ of Γ

(2) \[ |f'(z_0)| \le A n^{\alpha}; \]

here A is a constant which depends only on Γ and $z_0$, and $\alpha\pi$ is the exterior
angle of Γ at $z_0$. The order of this bound as n becomes infinite can not be
improved.

This is a generalization of Theorem C, at least so far as the order of the
bound of $|f'(z)|$ is concerned; for if Γ is a circle, the exterior angle at every
point is $\pi$ so that $\alpha = 1$. Incidentally, in this special case our general method
used in § 2 furnishes the inequality


|2z|/S1. 


For closed Jordan curves, Theorem 1 is an obvious extension of Theorem
B which was obtained under the more restrictive hypothesis $|f(z)| \le 1$ on Γ.
Theorem B, however, holds even if Γ is an open arc, while our Theorem 1
does not. Indeed let Γ be the real segment $-1 \le x \le +1$ and consider the
polynomial f(z) = iKz, z = x + iy, K real. In this case $\Re f(z) = 0$ on Γ,
but $|f'(1)|$ can be arbitrarily large. More generally, we can take for Γ a
Jordan arc along which the real part of a certain given polynomial is constant.

Our proof of Theorem 1 makes use of the theory of conformal mapping
and in particular of the theorems of Osgood-Taylor⁵ concerning the behavior
of the map-function near the boundary. (See however, the last remark in § 2.)


3. Under the conditions of Theorem 1 we may ask for proper bounds
for the “oscillation” of f(z) in Γ, that is for the maximum of $|f(z_1) - f(z_2)|$
if $z_1$ and $z_2$ describe, independently of each other, the closed interior of Γ.


THEOREM 2. Let Γ have the same meaning as in Theorem 1 and let
f(z) satisfy the same conditions as there. Then for two arbitrary points $z_1$
and $z_2$ in the closed interior of Γ,

(3) \[ |f(z_1) - f(z_2)| < A \log n, \]

here A depends only on Γ. The order of this bound as n becomes infinite can
not be improved.

⁵ W. F. Osgood and E. H. Taylor, “Conformal transformations on the boundaries
of their regions of definition,” Transactions of the American Mathematical Society,
vol. 14 (1913), pp. 227-228.


Let f(z) be real at a certain (not necessarily fixed) point in Γ; then
from (3)

(3') \[ |\Im f(z)| < A \log n, \qquad z \in \Gamma, \]

follows.


Theorem 2 is well-known for the case in which Γ is the unit circle. The
much discussed example

\[ f(z) = (i/2)\,(z/1 + z^2/2 + \cdots + z^n/n) \]

shows that log n is the true rate of growth of the bound in (3) or (3')
[f(0) = 0] in case Γ is the unit circle.
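The growth claimed for this example can be observed numerically. The sketch below is illustrative only: the constant factor i/2 is an assumption (the printed factor is not fully legible in this copy), and the function names and sample sizes are hypothetical. It samples the real part on the unit circle and compares the oscillation |f(1) − f(0)| with log n.

```python
import cmath
import math

def f(z, n):
    # Assumed example: f(z) = (i/2) * (z/1 + z^2/2 + ... + z^n/n).
    return 0.5j * sum(z ** k / k for k in range(1, n + 1))

def max_real_part_on_circle(n, samples=720):
    # Largest |Re f| over equally spaced sample points of |z| = 1.
    return max(abs(f(cmath.exp(2j * math.pi * t / samples), n).real)
               for t in range(samples))

for n in (10, 100, 400):
    oscillation = abs(f(1, n) - f(0, n))   # equals half the n-th harmonic number
    print(n, max_real_part_on_circle(n), oscillation, 0.5 * math.log(n))
```

With this normalization the sampled real part stays below 1 while the oscillation grows like (1/2) log n, which is the behavior the text attributes to the example.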

Theorem 2 has a more elementary character than Theorem 1; therefore
we found it convenient to bring its proof first. Having proved both
inequalities, we discuss the precision of our estimates as n becomes infinite. Obviously
Theorem 2 combined with Theorem B furnishes the less informative bound
$A n^{\alpha} \log n$ instead of the bound in (2).


In the proofs of both theorems we use the following 


LEMMA 1. Let Γ satisfy the conditions of Theorem B and let f(z) be a
$\pi_n$ satisfying $|f(z)| \le 1$ on Γ. We denote by $\psi(z)$ a function which maps
the exterior of Γ onto the exterior of the unit circle in such a way that the
points at infinity correspond. Then at a point $z_0$ outside Γ

(4) \[ |f(z_0)| \le R^n. \]

Here $|\psi(z_0)| = R > 1$.

This is a well-known consequence of the maximum principle.


1. Proof of Theorem 2. 1. Let Γ satisfy the conditions of Theorem 2,
and let $\beta > 0$ be the smallest interior angle at which two arcs of Γ join. If
$z_0$ is any point on Γ, we draw through $z_0$ two line segments $L_1$ and $L_2$ with
the following properties:

(a) $z_0$ is one end-point of $L_1$ and of $L_2$;

(b) the other end-points of $L_1$ and $L_2$ are also on Γ, whereas all other
points of $L_1$ and $L_2$ are in the open interior of Γ;

(c) at $z_0$, $L_1$ and $L_2$ intersect Γ (or one of the arcs of Γ if $z_0$ is a vertex)
with an angle of $\beta/4$.

The distance of any point Z on $L_1$ or $L_2$ from Γ (that is, from any point
z on Γ) is at least $\sin(\beta/8)$ times the distance from Z to $z_0$ provided Z is
sufficiently near to $z_0$. We determine the largest segments $L'_1$ and $L'_2$ on $L_1$
and $L_2$ respectively, having $z_0$ as one end-point, for whose points Z the condition

(5) \[ |Z - z| \ge |Z - z_0| \sin(\beta/8), \qquad z \in \Gamma, \]

is satisfied. In what follows, let $L = L(z_0)$ denote the larger of the segments
$L'_1$ and $L'_2$ (or either of them if they are equal). The length $l(z_0)$ of this
$L(z_0)$ has a positive minimum, say $l_0$, as $z_0$ runs over Γ.
2. The following statements are essentially known: 


LEMMA 2. If the real part of an analytic function F(z) is bounded by 1
in the open interior of a circle of radius r > 0, then at the center $z_0$ of this
circle

(6) \[ |F'(z_0)| \le 2/r. \]

LEMMA 3. If the real part of an analytic function F(z) is bounded by 1
in the unit circle |z| < 1, then

(7) \[ |F(z) - F(0)| \le 2 \log \frac{1}{1 - |z|}. \]

LEMMA 4. Let S be a segment of length s. If f(z) is a $\pi_n$ satisfying
$|f(z)| \le 1$ on S, then $|f(z)| \le K$ provided z lies within a distance $n^{-2}$ of
either end-point of S. Here K is a constant which depends on s but is
independent of n.

Inequality (6) may be obtained by differentiating Poisson's integral which
for a circle of radius $\rho$, $\rho < r$, with center at the origin is

\[ F(z) = i\,\Im\{F(0)\} + \frac{1}{2\pi}\int_{-\pi}^{+\pi} \Re F(\rho e^{it})\,\frac{\rho e^{it} + z}{\rho e^{it} - z}\,dt, \qquad |z| < \rho. \]

Inequality (7) may be obtained from (6) by integrating along a radius from
0 to z:

\[ |F(z) - F(0)| = \Big|\int_0^z F'(\zeta)\,d\zeta\Big| \le \int_0^{|z|} \frac{2\,dr}{1 - r} = 2\log\frac{1}{1 - |z|}, \]

which is (7). Lemma 4 follows from Lemma 1 of the Introduction where $\psi(z)$
is a function which maps the exterior of S onto the exterior of the unit circle
with the points at infinity corresponding. In case S is the segment (−1, +1)
of the real axis we need only note that $|\psi(z)|^n = |z + (z^2 - 1)^{1/2}|^n$ is bounded
if $|z - 1| \le n^{-2}$ or $|z + 1| \le n^{-2}$.
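The boundedness used for Lemma 4 in the case of the segment (−1, +1) can be illustrated numerically. The sketch below relies on a standard fact about the map z + (z² − 1)^{1/2}: its modulus R at a point z satisfies R + 1/R = |z − 1| + |z + 1| (the level curves are ellipses with foci ±1), which avoids choosing a branch of the square root. The function names and the sampling are hypothetical.

```python
import cmath
import math

def psi_abs(z):
    # |z + sqrt(z^2 - 1)| for the exterior branch, via R + 1/R = |z-1| + |z+1|.
    s = abs(z - 1) + abs(z + 1)
    return s / 2 + math.sqrt(max(s * s / 4 - 1, 0.0))

def worst_case(n, samples=360):
    # Largest |psi(z)|^n over sample points of the circle |z - 1| = n^-2.
    radius = n ** -2
    return max(psi_abs(1 + radius * cmath.exp(2j * math.pi * t / samples)) ** n
               for t in range(samples))

for n in (1, 10, 100, 1000, 10000):
    print(n, worst_case(n))
```

In this experiment the values stay bounded as n grows (they appear to approach exp(√2) ≈ 4.11), in line with the remark that |ψ(z)|^n is bounded within distance n^{-2} of an end-point.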


3. Now we proceed to the proof of Theorem 2. Let $\zeta_0$ be a fixed interior
point of Γ and F(z) an analytic function (not necessarily a polynomial)
satisfying the condition $|\Re F(z)| \le 1$, $z \in \Gamma$. Then, according to Lemma 3,

(8) \[ |F(z) - F(\zeta_0)| \le 2 \log \frac{1}{1 - |\phi(z)|}, \qquad z \in \Gamma, \]

where $w = \phi(z)$ is a function which maps the interior of Γ onto the circle
|w| < 1 such that $\phi(\zeta_0) = 0$. If $\delta$ is a small positive number, let $\Delta_\delta$ be
the set of points inside Γ which lie at a distance $\delta$ or greater from Γ. Let
$\delta = l_0 \sin(\beta/8)$; then the segments L drawn through each point $z_0$ of Γ
according to the former construction, extend into $\Delta_\delta$. Furthermore, from (8),

(9) \[ |F(z) - F(\zeta_0)| \le B, \qquad z \in \Delta_\delta; \]

here B is a constant which depends only on Γ and $\zeta_0$. Now let $z_0$ be a point
on Γ and let $z_1$ be the end-point of L different from $z_0$; then $z_1 \in \Delta_\delta$. If $\zeta$ is
any point of L, (5) and (6) imply that

\[ |F'(\zeta)| \le 2\,\{|\zeta - z_0| \sin(\beta/8)\}^{-1}. \]

This, together with (9), shows that if z is any point on L,

(10) \[ |F(z) - F(\zeta_0)| \le C \log \frac{C}{|z - z_0|}, \]

where C is again a positive constant which depends only on Γ and $\zeta_0$.

So far the polynomial character of our function has not been used. Now
let F(z) = f(z) be the $\pi_n$ of Theorem 2. In the portion of L which lies at a
distance greater than $n^{-2}$ from $z_0$, (10) shows that

\[ |f(z) - f(\zeta_0)| \le C \log (C n^2). \]

Since the length of L is greater than a fixed positive number $l_0$, Lemma 4
implies that

\[ |f(z) - f(\zeta_0)| \le K C \log (C n^2) \]

where K depends only on Γ. This completes the proof of Theorem 2 because

\[ |f(z_1) - f(z_2)| \le |f(z_1) - f(\zeta_0)| + |f(z_2) - f(\zeta_0)|. \]


2. Proof of Theorem 1. 1. Let $z_0$ be a point of Γ at which two arcs
$\gamma_1$ and $\gamma_2$ intersect with an exterior angle $\alpha\pi$, $0 < \alpha < 2$. It is no loss of
generality to suppose that $z_0 = 0$ and that the tangents to $\gamma_1$ and $\gamma_2$ at z = 0
intersect the real axis at angles of $\alpha\pi/2$ and $-\alpha\pi/2$, respectively; also we
may assume that a neighborhood of the negative real axis near the origin lies
inside Γ. With $\rho$ a small positive number draw two circles of radius $\rho$, the
first with center at $\rho \exp\{i(\alpha+1)\pi/2\}$ and the second with center at
$\rho \exp\{-i(\alpha+1)\pi/2\}$. These two circles will intersect at the origin, where
they are tangent to $\gamma_1$ and $\gamma_2$, respectively, and at the point

\[ z = 2\rho \cos\{(\alpha + 1)\pi/2\} \]

on the negative real axis. The arc of the first circle which lies above and on
the real axis, and the arc of the second circle which lies beneath and on the
real axis, together form the boundary of a region R which is closed and simply
connected and whose boundary touches Γ at z = 0. All other points of R
will lie inside Γ if $\rho$ is small enough, and the exterior angle of R at z = 0 is $\alpha\pi$.

If $\alpha = 1$ the two arcs which form the boundary of R are arcs of the same
circle, and R is the circle $|z + \rho| \le \rho$.

If $\delta$ is a small positive number, let $R_\delta$ be the region obtained by
translating R a distance $\delta$ to the left; that means $z \in R_\delta$ if and only if $(z + \delta) \in R$.
Now we show the following. If $\rho$ is small but fixed, then for all sufficiently
small $\delta$ the region $R_\delta$ will lie entirely inside Γ and at a distance at least $A\delta$
from Γ where A is a positive constant independent of $\delta$. Indeed let us repeat
the previous construction of R replacing $\rho$ by $3\rho$; the resulting region S is
bounded by two arcs of circles of radius $3\rho$ with centers at

\[ 3\rho \exp\{\pm i(\alpha + 1)\pi/2\}; \]

choose $\rho$ so small that S lies entirely in the closed interior of Γ. Now fix $\rho$; for

\[ 0 < \delta < \rho\,|\cos\{(\alpha + 1)\pi/2\}| \]

one shows by direct calculation that $R_\delta$ lies inside S and at a distance greater
than

\[ \tfrac13 \delta\,|\cos\{(\alpha + 1)\pi/2\}| \]

from the boundary of S, and so inside Γ and at least this distance from Γ.


2. Let f(z) be a polynomial satisfying the conditions of Theorem 1.
We conclude from (6) that if z is any point of $R_\delta$

(11) \[ |f'(z)| \le 2/(A\delta). \]

Let $\psi(z)$ be a function which maps the exterior of R onto the exterior of the
unit circle with the points at infinity mutually corresponding. Then $\psi(z + \delta)$
maps the exterior of $R_\delta$ onto the exterior of the unit circle, and we obtain
from (11) by application of Lemma 1 [cf. (4)]

\[ |f'(0)| \le \{2/(A\delta)\}\,|\psi(\delta)|^n. \]


But from a theorem of Osgood-Taylor mentioned in the Introduction [see ⁵]
we conclude that near the boundary point z = 0 of R the map-function $\psi(z)$
must be of the form

\[ \psi(z) = \psi(0) + z^{1/\alpha} p(z) \]

where $|\psi(0)| = 1$ and p(z) approaches a finite limit not zero as z approaches
zero. Then if $|p(z)| \le A$ for small |z|, we obtain

\[ |f'(0)| \le \{2/(A\delta)\}\,(1 + A\delta^{1/\alpha})^n. \]

Placing $\delta = n^{-\alpha}$ (permissible for large n) this gives

\[ |f'(0)| \le 2A^{-1} e^{A} n^{\alpha}, \]

which proves the theorem.

We notice that the map-function $\psi(z)$ of the region R may be calculated 
in terms of elementary functions. This makes it possible to avoid the use of 
the Osgood-Taylor theorem. 


3. Discussion of the precise order. 1. The bound $A n^{\alpha}$ in Theorem 1
is of the precise order as $n \to \infty$; this follows from the corresponding fact in
Theorem B.

We show that the bound A log n in Theorem 2 is also the precise one.
More exactly, let $\Gamma_0$ be a closed region in the open interior of Γ, $z_1$ a fixed
point on Γ and $z_2$ arbitrary in $\Gamma_0$; we construct a sequence $\{g_n(z)\}$ such that
gn(Z) is n=—1, and 

\[ |\Re g_n(z)| \le 1, \qquad z \in \Gamma, \]
\[ |g_n(z_1) - g_n(z_2)| > A' \log n; \]

here A' > 0 is independent of n.


2. By use of the polynomials $\sum_{\nu=1}^{n} z^{\nu}/\nu$ this construction is rather easy in
case a circle through $z_1$ exists containing Γ. The following method holds
generally. The principal tool is Faber's polynomials $f_n(z)$, $n \ge 1$, associated
with Γ. They are defined as follows. Let

(12) \[ w = \psi(z) = cz + c_0 + c_1 z^{-1} + \cdots, \qquad c > 0, \]

be the conformal mapping of the exterior of Γ onto the exterior of the unit
circle |w| > 1, uniquely determined by the condition c > 0. Then $f_n(z)$ is
defined as the “principal part” of $\{\psi(z)\}^n$, that is⁶

Here the integration is extended over a curve C enclosing Γ (and over the
corresponding curve in the w-plane), and z is in the interior of C. For the
construction mentioned we need the following expansion (slightly different
from the expansion in ², p. 54, (17)):

⁶ See ², p. 53.


(14) log W = log ° 


Here the determination of the logarithms is obvious; and z is arbitrary. But
|W| > 1 if z is in the interior of Γ, and $|W| > |\psi(z)|$ if z is in the
exterior of Γ.

Expansion (14) is clear for $W = \infty$. The differentiated expansion


follows from (13).

3. Let z be arbitrary in the closed interior of Γ, and let $|W| \ge 1$.
Then the imaginary part of (14) is uniformly bounded (see ², p. 54) so we
have for the Cesàro means of first order


| m=1 


zeT,|W| 21, 


where Q depends only on Γ. Also, the function (14) is bounded for |W| = 1
and for a fixed z in the open interior of Γ (uniformly if z is restricted to a
closed region $\Gamma_0$ entirely in the open interior of Γ); that is


(17) | > = ) | =, 


where the constant on the right depends only on Γ and $\Gamma_0$.


4. Let $z_1$ be an arbitrary point on Γ with the exterior angle $\alpha\pi$, $0 < \alpha < 2$,
and let $w_1 = \psi(z_1)$, $|w_1| = 1$. Assuming for a moment that Γ is a closed
polygon, we find by use of the Schwarz-Christoffel formula
(18) (W) —y(w,) = (W — w,)*F (W— w,) 
where F(t) is analytic around t = 0, and $F(0) \neq 0$. This furnishes, if
$|W - w_1|$ is sufficiently small,


Now, 


wn 
IW. 
on 


For the line of integration we choose two arcs $c_1$ and $c_2$; $c_1$ connects two points $w'$ and $w''$ of the unit circle (on opposite sides of $w_1$) and runs entirely in the exterior $|W| > 1$ of the unit circle; the other arc $c_2$ is the "large" arc $w'w''$ of the unit circle $|W| = 1$. We choose $w'$, $w''$, $c_1$ so that the function $F(W - w_1)$ is regular and $\neq 0$ in the domain bounded by the "small" arc of the unit circle and by $c_1$.

In the first integral of (20) we can replace $c_1 + c_2$ by the unit circle $|W| = 1$; the resulting integral approaches 0 as $n \to \infty$, according to Riemann's lemma. The second integral is $\alpha w_1^n$, so

(21) $f_n(z_1) = \alpha w_1^n + o(1)$, $n \to \infty$.


5. We define the required polynomials by 


ont n J mw,” 


According to (16) and (17) 
But according to (21), 


(23) gu(%1) =i +0 (log n) 


This shows that the bound $A \log n$ in Theorem 2 is of the right order as $n \to \infty$.


6. Finally we remove the condition that $\Gamma$ is a polygon. If $z_1$ is given, we construct a polygon $\Gamma'$ with the following properties:


(a) $\Gamma'$ contains $\Gamma$;
(b) $z_1$ is on $\Gamma'$;
(c) the exterior angle of $\Gamma'$ at $z_1$ is $\alpha\pi$, $0 < \alpha < 2$.


Obviously there is no difficulty in constructing such a polygon IY so long as 
a 
Repeating the previous consideration for I’, we obtain a sequence of mn 
satisfying conditions (22); instead of (23) we have 


which suffices for our purpose. 


STANFORD UNIVERSITY, CALIFORNIA. 




NEW PROOF OF A THEOREM OF G. H. HARDY AND
S. RAMANUJAN ON THE ASYMPTOTIC
BEHAVIOR OF THE PARTITION
COEFFICIENTS.*


By VOJISLAV G. AVAKUMOVIĆ.


If $p(n)$ denotes the number of distinct partitions of $n$ into equal or unequal positive integral summands, then, as is well known, the Hardy-Ramanujan asymptotic expansion ¹ of $p(n)$ yields in first approximation the formula


I) $p(n) \sim \dfrac{1}{4\sqrt{3}\,n} \exp\left[\pi\sqrt{\tfrac{2}{3}\,n}\right]$.


I gave a proof of this formula, on the basis of general Tauberian theorems of function-theoretic type, in the section lecture “Über das Verhalten Laplacescher Integrale an der Konvergenzgrenze u.s.w.” (2. Congr. Interbalkan. des Math., Bucarest, 12-IX-1937; Bull. Math. Soc. Roum. Sci. 40, Nr. 1/2, 1938, pp. 101-106).²

In what follows I should like to prove formula I), by what is in principle the same method, in as short a way as possible.
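As a numerical illustration of formula I), the following sketch compares exact values of $p(n)$ with the first-order estimate. It is not part of the proof: the exact values are computed with Euler's pentagonal-number recurrence, a standard device swapped in here for convenience, and the function names are hypothetical.

```python
import math


def partition_numbers(limit):
    """Return [p(0), ..., p(limit)] via Euler's pentagonal-number recurrence."""
    p = [1] + [0] * limit
    for n in range(1, limit + 1):
        total = 0
        k = 1
        while True:
            g1 = k * (3 * k - 1) // 2   # generalized pentagonal numbers
            g2 = k * (3 * k + 1) // 2
            if g1 > n:
                break
            sign = 1 if k % 2 == 1 else -1
            total += sign * p[n - g1]
            if g2 <= n:
                total += sign * p[n - g2]
            k += 1
        p[n] = total
    return p


def hardy_ramanujan_estimate(n):
    """First-order estimate exp(pi*sqrt(2n/3)) / (4*n*sqrt(3)) of formula I)."""
    return math.exp(math.pi * math.sqrt(2.0 * n / 3.0)) / (4.0 * n * math.sqrt(3.0))


p = partition_numbers(1000)
for n in (10, 100, 1000):
    # The last column is the ratio of the exact value to the estimate.
    print(n, p[n], p[n] / hardy_ramanujan_estimate(n))
```

The ratio in the last column approaches 1 only slowly, which is consistent with I) being merely the first term of the asymptotic expansion.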

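1) For $\Re(s) > 0$ we have

$g(s) = \prod_{n=1}^{\infty} (1 - e^{-ns})^{-1} = \sum_{n=0}^{\infty} p(n)\,e^{-ns}$,

hence, on setting
(1) $A(u) = p(n)$ for $n \leq u < n + 1$, $(n = 0, 1, 2, \ldots)$,
we obtain
(2) $\int_0^{\infty} e^{-su} A(u)\,du = \dfrac{1 - e^{-s}}{s}\, g(s)$.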
Sei 


0 fir OS u < 1/24 
=< 1 fiir 1/24 Su < 25/24 
0 fiir 25/24 u 


* Received April 29, 1940. 
¹ G. H. Hardy and S. Ramanujan, “Asymptotic formulae in combinatory analysis,” Proceedings of the London Mathematical Society (2), vol. 17 (1918), pp. 75-115.
² I sketched the proof of a special case of these theorems in the note: “Théorèmes relatifs aux intégrales de Laplace sur leur frontière de convergence,” C. R. de l'Acad. des Sci. Paris, vol. 204 (1937), pp. 224-226.


Then


e*"B(u)du 


(exp[ 
( (exp[7*/6s] —1), 


q 


from which, together with (2),


(w) — B(w) }du 


( g(s) — = (exp[x?/6s] — 1) 
= J(s) 


follows. By virtue of the functional equation


$g(s) = \sqrt{s/2\pi}\, \exp[-s/24 + \pi^2/6s]\; g(4\pi^2/s)$,


valid for the elliptic modular function $g(s)$, one sees that


im 


bei festem a fiir jedes « > 0 eine im Intervall (— «, + «) gleichmissig in « 
beschrinkte und absolut integrable Funktion darstellt. Also ist 


a (¢ + + dt 
-0O 
+00 +00 
= af ef A (uw) — B(u)}du exp[— + 2ai(x — u)t 
0 
=aVr fou (u) — B(u)} exp E eu — a” au/ 


woraus 


(3) {A(u) — B(u)} exp [—« Vi 


+00 . 
=a/Var f erarit + 2ait) dt 
-00 
folgt, da im Integral rechts wegen lim sup | A (u)— B(w) | exp[—8u] < const. 
U=% 


(fiir jedes 8 > 0) der Grenziibergang «— 0 erlaubt ist. Wegen 




dt 




(u — 1/24)"-*/2 — (u — 25/24)»-1/2 
+ 1)P(v—1/2) (v—1/2) 


4V/3u 6 


2 
a B(u) exp | — ar 
Je 


~ Ve exp [ | 0 


B(u)=1/V 2x > 25/24) 
p=1 


ist 


so dass aus (3) schliesslich 


(4) af A(w) exp Vu 
0 


—— exp | + |, 
6 


follows.³


2) Mit ist 


1+ 0(1) = exp[— exp [2 y—2 ]) 
) 


2 al %~ (2 — u)* du 
x dy V3 exp] — 27 
also 


(5) lim sup A(y)4 V3 y eXp 2 V5 y | 


Y=OO 


— Vaexp[2/24a? — Vr*6a] 


f _ edu 
-Va 


3) Sei Q= Q(t) die kleinste nicht-negative, nach (5) stets vorhandene 


Funktion, fiir die 


1 
4V3t 6 
ist. Wird zur Abkiirzung 
= Min Q(t) 


gesetzt, so folet wegen (4) und ~ 


³ As $x \to \infty$ the integral on the right of (3), being a Fourier constant of an absolutely integrable function, tends to 0.




o0(1) =z exp [—2 af (1+ 9) 
X exp [ 2 —A(u) [— a? du/Vu 


2 2 2 
exp [— 2 a exp [ 2 u—a?’ 


— £ exp | | af exp [— a? du/Vu, 


a-Va2/a 


also 


lg? 
(6) lim inf exp [—2 | 


0 
f _exp[ V 77/6 u/a — u*|du 
 -Va 


© 
f _e“du 
-Va 


4) From (1), (5) and (6) follows I).


= W.(a) > 1,45 


UNIVERSITY, MATHEMATICAL SEMINAR, 
BEOGRAD, YUGOSLAVIJA. 




AN ALGEBRAIC PROBLEM INVOLVING THE INVOLUTORY 
INTEGRALS OF LINEAR DYNAMICAL SYSTEMS.* 


By JOHN WILLIAMSON.


Introduction. In what follows $f = f(x)$, $g = g(x)$, etc. are scalar functions with continuous second derivatives and are not constant in the $x$-domain under consideration. The point $x = (x_1, x_2, \ldots, x_{2n})$ is a point of the $2n$-dimensional phase space

$x_i = p_i, \quad x_{i+n} = q_i, \qquad (i = 1, 2, \ldots, n),$

where, with the usual notation of dynamics, the $q_i$ denote coördinates and the $p_i$ denote momenta.

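Two functions $f(x)$ and $g(x)$ are said to be in involution, if the Poisson parenthesis,¹

(1) $(f, g) = \sum_{i=1}^{n} \left( \dfrac{\partial f}{\partial p_i}\dfrac{\partial g}{\partial q_i} - \dfrac{\partial f}{\partial q_i}\dfrac{\partial g}{\partial p_i} \right)$

vanishes identically. On denoting by $G$ the skew symmetric matrix, whose square is the negative identity,

$G = \begin{pmatrix} 0 & E \\ -E & 0 \end{pmatrix}$,

where $E$ is the unit matrix of order $n$, (1) can be written in the more compact form

(2) $(f_x)'\, G\, g_x = 0$.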


In (2) $f_x$ and $g_x$ denote the gradients of $f$ and $g$ respectively and $(f_x)'$ the transposed of the column vector $f_x$. A set of $m$ forms $f_1, f_2, \ldots, f_m$ is called an involutory system, if any two of them are in involution, i.e., if $(f_i, f_j) = 0$, $i, j = 1, 2, \ldots, m$. One can readily verify that $m$ is less than or equal to $n$, if the involutory system consists of independent functions,² independent in the


* Received March 18, 1940. 

¹ Cf. e.g. E. T. Whittaker, Analytical Dynamics, Cambridge University Press (1904), page 288.

² Let $A'GA = 0$, where $G$ is non-singular of order $2n$ and $A$ is of rank $r$. Then, if $PAQ = B = \begin{pmatrix} E_r & 0 \\ 0 & 0 \end{pmatrix}$, where $E_r$ is the unit matrix of order $r$, $B'SB = 0$, where $G = P'SP$. Hence, if $S = (s_{ij})$, $i, j = 1, 2, \ldots, 2n$, then $s_{ij} = 0$, $i, j = 1, 2, \ldots, r$. Since $G$ and therefore $S$ is non-singular, $r$ is less than or equal to $n$.




ordinary function sense, and that the maximum value $n$ of $m$ may be obtained for suitably chosen independent functions.

If $h = h(x)$ is the Hamiltonian function of a conservative dynamical system, with the above notation we may write ³

(3) $G\dot{x} = h_x$,

where $\dot{x} = dx/dt$.

If $f$ is a conservative integral of (3), $(f_x)'\dot{x}$ is identically zero in $x$ or, since $G = -G^{-1}$, by (3)

(4) $(f_x)'\, G\, h_x = 0$.

Conversely, if (4) is satisfied, $f = f(x)$ is a conservative integral of (3). Hence the $m$ functions

$f_1 = h, f_2, \ldots, f_m$

are $m$ conservative integrals in involution, if, and only if,

(5) $(f_{ix})'\, G\, f_{jx} = 0$, $(i, j = 1, 2, \ldots, m)$.

It is known that $m = n$, but not more than $n$, conservative integrals in involution may be chosen to be independent in the functional sense mentioned above.

There remains the question: what becomes of these analytical facts in case the dynamical system is linear, i.e. if $h = h(x)$ is the quadratic form $\tfrac{1}{2}x'Hx$, where $H$ is an arbitrary, but not zero, $2n$-rowed symmetric matrix, representing the Hessian of $h$. In this case the Hamiltonian system appears in the simplified form

(6) $G\dot{x} = Hx$.

Further the quadratic form $f = \tfrac{1}{2}x'Fx$ is by (4) an integral of (6), if, and only if,

(7) $x'FGHx = 0$.

Equation (7) is however equivalent to

$FGH + (FGH)' = 0$,

or, since $F$ and $H$ are symmetric and $G$ skew symmetric, to

(8) $FGH = HGF$.
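The step from (7) to (8) can be checked on a small example. The sketch below is an illustration only: it assumes the sign convention $G = \begin{pmatrix} 0 & E \\ -E & 0 \end{pmatrix}$, takes an arbitrarily chosen symmetric $H$ with $n = 2$, uses the hypothetical helper names shown, and tests the candidate $F = HGHGH$.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def symplectic_unit(n):
    """G = [[0, E], [-E, 0]] with E the n-by-n unit matrix (sign convention assumed)."""
    size = 2 * n
    G = [[0] * size for _ in range(size)]
    for i in range(n):
        G[i][n + i] = 1
        G[n + i][i] = -1
    return G


def is_quadratic_integral(F, G, H):
    """Condition (8): F G H == H G F."""
    return matmul(matmul(F, G), H) == matmul(matmul(H, G), F)


G = symplectic_unit(2)
H = [[2, 1, 0, 1],
     [1, 3, 1, 0],
     [0, 1, 1, 2],
     [1, 0, 2, 4]]                                   # arbitrary symmetric example
F = matmul(matmul(matmul(matmul(H, G), H), G), H)    # F = H G H G H
M = matmul(matmul(F, G), H)                          # matrix of the form in (7)
x = [[3], [-1], [2], [5]]                            # arbitrary test point
value = matmul(matmul(transpose(x), M), x)[0][0]     # x' F G H x
print(is_quadratic_integral(F, G, H), value)
```

Here $FGH$ comes out skew symmetric, so the form in (7) vanishes at every point, not only at the test point chosen.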


Similarly the m quadratic forms, which belong to the symmetric matrices 


³ Aurel Wintner, “On the linear conservative dynamical systems,” Annali di matematica pura ed applicata, ser. 4, tomo 13 (1935-36), pp. 105-112.




$F_1 = H, F_2, \ldots, F_m$ are $m$ quadratic integrals of the system (6), forming an involutory system, if, and only if,⁴

(9) $F_iGF_j = F_jGF_i$, $(i, j = 1, 2, \ldots, m)$.


It is understood that all the matrices $F_i$ are distinct from zero but are not necessarily non-singular.

By the general theorem, mentioned for non-linear systems, there always exist $m = n$ independent integrals in involution for the linear system (6). The conjecture was made by Professor Wintner that, in the case of a linear system, these $m = n$ integrals may be chosen to be quadratic forms. The main purpose of this paper is to show that this conjecture is correct: for every $2n$-rowed non-zero symmetric matrix $H$, there exist $n$ symmetric matrices $F_1 = H, F_2, \ldots, F_n$, which are independent and satisfy the involutory condition (9). It is understood that independence is now meant in the algebraic sense, i.e., that the corresponding quadratic forms are functionally independent.

By the general theory there always exist $2n - 1$ integrals, which are independent, and $n - 1$ may be obtained, by a theorem of Liouville, from the $n$ independent integrals in involution by means of quadratures and eliminations.⁵ In the linear case it is possible that some of these $n - 1$ integrals may also be quadratic; in fact, if the minimal equation of $HG^{-1}$ is of degree $2m$, this number is $l = n - m$ and, if the degree of the minimal equation is $2m - 1$, the number is $l = n - m + 1$. The remaining $n - l - 1$ must then be determined by local elimination processes, which seem to lie outside the scope of an algebraic treatment.

It was found advisable first to determine the linearly independent quadratic integrals, a comparatively simple process; and then from them to determine the quadratic integrals independent in the more general sense. This was accomplished by the extensive use of linear differential operators, similar to the Aronhold operators of classical invariant theory.⁶ In § 6, when $H$ is singular, the linear integrals of the system (6) are determined.

In the final section it is shown that the dynamical system, corresponding 
to the equations of variation of the small vibrations about an equilateral 
Lagrangian libration point in the restricted problem of three bodies, has, for 
all values of the masses, in addition to the energy integral only one quadratic 
integral; and this integral is determined. 

The methods employed throughout the paper are purely algebraic, and the 


⁴ Aurel Wintner, loc. cit., page 108.
⁵ E. T. Whittaker, op. cit., page 311.
⁶ E.g., L. E. Dickson, Modern Algebraic Theories, Chicago (1926), pp. 25-27.




proofs, to a large extent, are based on results, previously proved by the author,⁷ which give normal forms for a pencil of matrices, whose base consists of a symmetric and a non-singular skew symmetric matrix. These results, for convenience of reference, are given in § 1.


1. Normal forms. If $H$ is a symmetric and $G$ a non-singular skew-symmetric matrix, we shall say that the pair $A$, $B$ is equivalent to the pair $H$, $G$, if there exists a non-singular matrix $P$, such that

$PHP' = A$ and $PGP' = B$.

In normal form the matrices $A$ and $B$ of the pair equivalent to $H$, $G$ are similarly partitioned diagonal block ⁸ matrices

$A = [A_1, A_2, \ldots, A_k]$, $B = [B_1, B_2, \ldots, B_k]$,

the blocks being determined by the elementary divisors of the matrix pencil $H - xG$. The elementary divisors of this pencil are subject to the following restrictions;⁹ if $(x - a)^r$, $a \neq 0$, occurs $s$ times amongst the elementary divisors of the pencil, then so does the elementary divisor $(x + a)^r$ and the elementary divisor $x^r$, where $r$ is odd, if it does occur, must occur an even number of times. Since the field of operations is the real field, there are four distinct forms for the matrices $A_i$ and $B_i$.


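Type (a). The pencil $A_i - xB_i$ has the single pair of real elementary divisors $(x - p)^r$, $(x + p)^r$. Then ¹⁰

(10) $A_i = \begin{pmatrix} 0 & L_i \\ L_i' & 0 \end{pmatrix}$, $B_i = \begin{pmatrix} 0 & E_r \\ -E_r & 0 \end{pmatrix}$,

where

(11) $L_i = pE_r + U_r$.

In (11), $E_r$ is the unit matrix of order $r$ and $U_r$ the auxiliary unit matrix of the same order. In particular, if $r$ is odd, $p$ may be zero.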


Type (a₁). The pencil $A_i - xB_i$ has only the four elementary divisors $(x \pm a \pm ib)^r$, $b \neq 0$. The matrices $A_i$ and $B_i$ are still determined by (10) and (11), if each unit is replaced by the two-rowed unit matrix, each zero by the two-rowed zero matrix and $p$ by the matrix $\begin{pmatrix} a & b \\ -b & a \end{pmatrix}$.


⁷ John Williamson, “On the algebraic problem concerning the normal forms of linear dynamical systems,” American Journal of Mathematics, vol. 58 (January, 1936), pp. 141-163. The general field K is now the field R of all real numbers and the particular results now required are given in § 5, pp. 161-163.

⁸ The matrices $A_i$ and $B_i$ are square matrices of the same order.

⁹ John Williamson, loc. cit., page 145 and page 162.

¹⁰ John Williamson, loc. cit., page 158, formula (59).


Type (b). The pencil $A_i - xB_i$ has only the two pure imaginary divisors $(x \pm ib)^r$. Then ¹¹


where e is the unit matrix of order 2, and 


(13) 


(14) 
0 0 0 1) 
0 0 at 

15 

(15) 0 0 


((—1) 0 . 0 0 


Type (b,). The pencil 4; — 2B; has the single elementary divisor 2°". 
Then A; = U2, and B; = Xo. 

For later purposes we require the following. If $r$ is even, $X_r$ is skew symmetric and, if $r$ is odd, $X_r$ is symmetric and therefore for all values of $r$ the matrix $B_i$ in (14) is skew symmetric. Further

(16) $X_rU_r = -U_r'X_r$

and, since $X_r^2 = \pm E_r$,
(17) Winey — 
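The properties of $X_r$ quoted above can be confirmed for small $r$. Since the display defining $X_r$ is damaged in this copy, the sketch assumes that $X_r$ has the entry $(-1)^{i-1}$ in position $(i, r+1-i)$ and zeros elsewhere; this reading, and the helper names, are assumptions rather than the author's text.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def auxiliary_unit(r):
    """U_r: ones on the first superdiagonal, zeros elsewhere."""
    return [[1 if j == i + 1 else 0 for j in range(r)] for i in range(r)]


def alternating_counteridentity(r):
    """Assumed X_r: entry (i, r+1-i) equals (-1)^(i-1) (1-based), zeros elsewhere."""
    return [[(-1) ** i if j == r - 1 - i else 0 for j in range(r)] for i in range(r)]


def check(r):
    """Return (identity (16) holds, X^2 = (-1)^(r-1) E holds, X is symmetric)."""
    U, X = auxiliary_unit(r), alternating_counteridentity(r)
    XU = matmul(X, U)
    UtX = matmul(transpose(U), X)
    identity16 = XU == [[-v for v in row] for row in UtX]
    sign = (-1) ** (r - 1)
    square = matmul(X, X) == [[sign if i == j else 0 for j in range(r)] for i in range(r)]
    symmetric = transpose(X) == X
    return identity16, square, symmetric


for r in range(1, 7):
    print(r, check(r))
```

Under this assumption (16) holds for every $r$ tried, $X_r^2 = (-1)^{r-1}E_r$, and $X_r$ is symmetric exactly when $r$ is odd, in agreement with the statements in the text.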


In type (a), when $p \neq 0$, any matrix $D$ commutative with $A_iB_i^{-1}$ is of the form $\begin{pmatrix} D_{11} & 0 \\ 0 & D_{22} \end{pmatrix}$, where

(18) $D_{11} = \sum_{k=0}^{r-1} f_kU^k$ and $D_{22} = \sum_{k=0}^{r-1} g_kU'^k$.

If $p = 0$, the matrices defined by (18) are certainly commutative with $A_iB_i^{-1}$ but of course are not the only ones.
In type (a₁) $D$ has the same form except that $f_k$ and $g_k$ are both polynomials in $p$, i.e., are of the form $\begin{pmatrix} c & d \\ -d & c \end{pmatrix}$. In type (b), $D$ has the form $D_{11}$ in (18), where $f_k$ is again a polynomial ¹² in $p$.


¹¹ John Williamson, loc. cit., page 155, formula (55). Formula (55) is of course simplified for this special case as indicated on page 162. The fact that $B_i$ is not unique but may be replaced by $-B_i$ does not alter the form of a matrix commutative with $B_i$.


2. We now consider the purely algebraic problem of determining the number $m$ of linearly independent symmetric matrices $F_i$ of order $2n$, which satisfy

(19) $F_iGH = HGF_i$, $(i = 1, 2, \ldots, m)$,

where $H$ is a given symmetric matrix and $G = \begin{pmatrix} 0 & E \\ -E & 0 \end{pmatrix}$ is the non-singular skew symmetric matrix mentioned in the introduction. If (19) is satisfied,

$F_iGHG = HGF_iG$,

or, since $G = -G^{-1}$,

(20) $F_iG^{-1}HG^{-1} = HG^{-1}F_iG^{-1}$.

Hence, if $F_i$ satisfies (19), $F_iG^{-1}$ is commutative with $HG^{-1}$. Since the number and nature of the linearly independent matrices $F_iG^{-1}$, commutative with $HG^{-1}$, are known,¹³ it is only necessary to determine for which of these known matrices $F_iG^{-1}$ the matrix $F_i$ is symmetric.

The number of linearly independent matrices commutative with $HG^{-1}$ depends on the number and the nature of the elementary divisors of the matrix pencil $H - xG$. Hence, in considering the general case, it is necessary to reduce $H$ and $G$ to the normal forms given in section 1. However, if $HG^{-1}$ is not derogatory, i.e., if the minimal equation of $HG^{-1}$ is the same as its characteristic equation, any matrix commutative with $HG^{-1}$ is a polynomial ¹⁴ in $HG^{-1}$. A maximal set of linearly independent matrices commutative with $HG^{-1}$ therefore contains exactly $2n$ members; and one such set consists of the $2n$ distinct powers of $HG^{-1}$, i.e., of the $2n$ matrices

$(HG^{-1})^k$, $(k = 0, 1, 2, \ldots, 2n - 1)$.

We may therefore suppose that

$F_jG^{-1} = (HG^{-1})^j$


¹² J. H. M. Wedderburn, “Lectures on matrices,” American Mathematical Society Colloquium Publications, vol. 17 (1934), page 124; John Williamson, “The idempotent and nilpotent elements of a matrix,” American Journal of Mathematics, vol. 58 (1936), p. 477.

¹³ J. H. M. Wedderburn, op. cit., page 105.

¹⁴ J. H. M. Wedderburn, op. cit., page 27; C. C. MacDuffee, The Theory of Matrices, Berlin (1933), page 94.


or that

$F_j = (HG^{-1})^jG$.

Consequently, $F_j'$ the transposed of $F_j$ satisfies

$F_j' = -G(-G^{-1}H)^j = (-1)^{j-1}G(G^{-1}H)^j = (-1)^{j-1}(HG^{-1})^jG = (-1)^{j-1}F_j$.

If $F_j$ is symmetric, $j - 1$ must be even and $j$ must be odd. Therefore, if $HG^{-1}$ is not derogatory, there exist exactly $n$ linearly independent symmetric matrices $F_i$, which satisfy (19). One such set consists of the $n$ matrices ¹⁵

(21) $F_i = (HG^{-1})^{2i-1}G$, $(i = 1, 2, \ldots, n)$.

Since the matrices $F_iG^{-1}$, where $F_i$ is defined in (21), are all polynomials in $HG^{-1}$, it follows that

$F_iGF_j = F_jGF_i$, $(i, j = 1, 2, \ldots, n)$,

and hence, that the $n$ quadratic integrals corresponding to the matrices $F_i$ form a set of integrals in involution. Consequently, we have

THEOREM 1. If $HG^{-1}$ is not derogatory, there exist $n$ linearly independent quadratic integrals of the system (6). These $n$ quadratic integrals form a set in involution and may be so chosen that the corresponding matrices are the matrices $F_i$ in (21).
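The family of Theorem 1 is easy to generate. The following sketch is an illustration under stated assumptions: it uses the equivalent form $F_i = (HG)^{2(i-1)}H$ given in the footnote to (21), the sign convention $G = \begin{pmatrix} 0 & E \\ -E & 0 \end{pmatrix}$, and, as an example not taken from the paper, the matrix $H = \mathrm{diag}(1, 2, 3, 4)$ of two uncoupled oscillators with distinct frequencies; the function names are hypothetical.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def symplectic_unit(n):
    """G = [[0, E], [-E, 0]] with E the n-by-n unit matrix (sign convention assumed)."""
    size = 2 * n
    G = [[0] * size for _ in range(size)]
    for i in range(n):
        G[i][n + i] = 1
        G[n + i][i] = -1
    return G


def quadratic_integrals(H, G, n):
    """Return F_1, ..., F_n with F_i = (H G)^(2(i-1)) H, the footnote form of (21)."""
    HG = matmul(H, G)
    HG2 = matmul(HG, HG)
    family, F = [], H
    for _ in range(n):
        family.append(F)
        F = matmul(HG2, F)
    return family


def in_involution(Fi, Fj, G):
    """Condition (9): F_i G F_j == F_j G F_i."""
    return matmul(matmul(Fi, G), Fj) == matmul(matmul(Fj, G), Fi)


G = symplectic_unit(2)
H = [[1, 0, 0, 0], [0, 2, 0, 0], [0, 0, 3, 0], [0, 0, 0, 4]]   # example, n = 2
family = quadratic_integrals(H, G, 2)
for F in family:
    print(F)
print(all(in_involution(A, B, G) for A in family for B in family))
```

In this example $F_2$ is diagonal and not a multiple of $F_1 = H$, so the two forms are linearly independent, as the theorem asserts for the non-derogatory case.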
It will be shown later (§ 3) that these $n$ quadratic integrals are not only linearly independent but also functionally independent.

If a matrix $F$, which satisfies (8), is not symmetric, we find, on taking the transposed of both sides of (8), that

$H'G'F' = F'G'H'$

or, since $H$ is symmetric and $G$ skew symmetric, that

$F'GH = HGF'$.

Therefore we have

LEMMA 1. If a matrix $F$ satisfies (8), so does the matrix $F'$, the transposed of $F$.

If the pencil of matrices $H - xG$ is congruent to the pencil $A - xB$, so that there exists a non-singular matrix $P$ satisfying both of the equations,


$PHP' = A$ and $PGP' = B$,


¹⁵ Since $G^{-1} = -G$ the matrix $F_i = (HG)^{2(i-1)}H$. The matrix $G^{-1}$ is used instead of $-G$ to emphasize the fact, that it is the matrix pencil $H - xG$ which is the dominating factor throughout this discussion.




then

$AB^{-1}Q_iB^{-1} = Q_iB^{-1}AB^{-1}$,

where $Q_i = PF_iP'$.

Consequently, we may replace the matrices $H$ and $G$ in (19) or (20) by any pair $A$, $B$ equivalent to $H$, $G$ or, what is equivalent to this, we may suppose that the pair $H$, $G$ is already in the normal form described in § 1. We therefore do this and assume that $H$ and $G$ are the matrices $A$ and $B$ respectively of § 1. We let

$F = (F_{ij})$, $(i, j = 1, 2, \ldots, k)$,

be a partition of $F$ similar to that of $A$ or $B$ and therefore, as a consequence of (20), have

(22) $F_{ij}B_j^{-1}A_jB_j^{-1} = A_iB_i^{-1}F_{ij}B_j^{-1}$.

If $FG^{-1}$ is the most general matrix commutative with $HG^{-1}$, by Lemma 1, we obtain the most general symmetric matrix satisfying (8) from this by putting $F_{ji} = F'_{ij}$, $i < j$, and restricting $F_{ii}$ to be symmetric. If $A_iB_i^{-1}$ is non-derogatory, it is a consequence of Theorem 1, that the number of linearly independent matrices $F_{ii}B_i^{-1}$ commutative with $A_iB_i^{-1}$, for which $F_{ii}$ is symmetric, is exactly one half the order of $A_i$, i.e. is one half the number of linearly independent matrices commutative with $A_iB_i^{-1}$. Consequently, if each of the matrices $A_iB_i^{-1}$ is non-derogatory, the number of linearly independent symmetric matrices $F$, which satisfy (8), is exactly one half the total number of linearly independent matrices commutative with $HG^{-1}$ and as remarked earlier, this number is known.¹⁶ A maximal set of symmetric matrices $F_i$, which satisfy (9), must consist entirely of diagonal block matrices

$[F_{11}, F_{22}, \ldots, F_{kk}]$

and, as a consequence of Theorem 1, the number of linearly independent matrices in such a set is $n$. It is apparent from § 1, that $A_iB_i^{-1}$ is derogatory, if, and only if, the pencil $A_i - xB_i$ is of type (a) with $p = 0$, and that then $H$ is singular. Accordingly, we have proved

THEOREM 2. If $H$ is non-singular, there exist $n$ linearly independent quadratic integrals of the system (6), which form a set in involution. The number of linearly independent quadratic integrals of the system (6) is exactly one half the number of linearly independent matrices commutative with $HG^{-1}$.


If $A_iB_i^{-1}$ is derogatory, the elementary divisors of the pencil $A_i - xB_i$


16 J. H. M. Wedderburn, op. cit., page 105. 




are $x^r$, $x^r$ where $r$ is odd. On dropping the suffix $r$, we obtain from equation (10)


(23) $A_iB_i^{-1} = \begin{pmatrix} U & 0 \\ 0 & -U' \end{pmatrix}$.


If 
Pj; = DB;", 


as a consequence of (22), we have

(24) $D[U, -U'] = [U, -U']D$.


Finally, if $D = (D_{ij})$, $i, j = 1, 2$, is a partition of $D$ similar to that of $A_iB_i^{-1}$ in (23), then (24) yields the equations


(25) $D_{11}U = UD_{11}$, $D_{22}U' = U'D_{22}$, $D_{12}U' = -UD_{12}$, $D_{21}U = -U'D_{21}$.

The matrix $F_{ii}$ therefore has the form

Dx, De» 

— Dy, Dy» 
and is symmetric, if, and only if, $D_{21}$ and $D_{12}$ are symmetric and $D'_{11} = -D_{22}$. By (16), $U' = -XUX^{-1}$ and therefore by (25), $D_{12}XUX^{-1} = UD_{12}$; so that $D_{12}X$ is commutative with $U$ and is therefore a polynomial in $U$. Hence

$D_{12}X = \sum_{i=0}^{r-1} f_iU^i$.

Since $r$ is odd, $X$ is symmetric and therefore,

$D'_{12} = X^{-1}\sum_{i=0}^{r-1} f_iU'^i = \sum_{i=0}^{r-1} (-1)^i f_iU^iX^{-1}$.

If $D_{12}$ is symmetric, $f_i = 0$ when $i$ is odd and therefore, if $r = 2m + 1$, $D_{12}$ depends on the $m + 1$ parameters $f_i$, $i = 0, 2, 4, \ldots, 2m$. Similarly, if $D_{21}$ is symmetric, $D_{21}$ depends on $m + 1$ parameters. Finally, if $D_{11}$ is the most general matrix commutative with $U$, then $D_{11}$ depends on $r = 2m + 1$ parameters and $D'_{11}$ is the most general matrix commutative with $U'$. Hence, if $D'_{11} = -D_{22}$, the matrix pair $D_{11}$ and $D_{22}$ depends on only $2m + 1$ parameters. Therefore the matrix $F_{ii}$, when it is symmetric, depends on

$m + 1 + m + 1 + 2m + 1 = 4m + 3$

parameters. The general matrix $F_{ii}B_i^{-1}$, commutative with $A_iB_i^{-1}$, however, depends on $4r = 8m + 4$ parameters and $4m + 3 = \tfrac{1}{2}(4r) + 1$. For example, in the simplest case, $r = 1$, $A_i$ is the zero matrix and $F_{ii}$ is of course arbitrary. To restrict $F_{ii}$ to be symmetric imposes only one condition and there are,


therefore, in this simple case $3 = \tfrac{1}{2}\cdot 4 + 1$ linearly independent symmetric matrices $F_{ii}$. Consequently, if $A_iB_i^{-1}$ is derogatory, the number of linearly independent symmetric matrices $F_{ii}$ is one more than half the number of linearly independent matrices commutative with $A_iB_i^{-1}$. Moreover, the matrices $F_{ii}$, for which $D_{12} = D_{21} = 0$, form a set in involution, since any two such matrices $F_{ii}B_i^{-1}$ are obviously commutative. Their number is $r = 2m + 1$ and we therefore have

THEOREM 3. The number of linearly independent symmetric matrices $F_i$, which satisfy (19), is exactly one half the number of linearly independent matrices commutative with $HG^{-1}$, unless the pencil $H - xG$ has a pair of elementary divisors of the form $(x^{2m+1}, x^{2m+1})$. For each such pair of elementary divisors, the number of linearly independent symmetric matrices $F_i$ is increased by one.
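The counts in Theorem 3 can be reproduced by solving the linear system $FGH = HGF$ for a symmetric unknown $F$. The sketch below is an illustration only, with exact rational elimination; it assumes $G = \begin{pmatrix} 0 & E \\ -E & 0 \end{pmatrix}$, reads the $p = 0$ block of (10) as $H = \begin{pmatrix} 0 & U \\ U' & 0 \end{pmatrix}$, and all function names are hypothetical.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def symplectic_unit(n):
    """G = [[0, E], [-E, 0]] with E the n-by-n unit matrix (sign convention assumed)."""
    size = 2 * n
    G = [[0] * size for _ in range(size)]
    for i in range(n):
        G[i][n + i] = 1
        G[n + i][i] = -1
    return G


def rank(rows):
    """Rank of a rational matrix by Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in rows]
    found = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            if rows[r][col] != 0:
                factor = rows[r][col] / rows[found][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found


def symmetric_solution_count(H, G):
    """Dimension of the space of symmetric F with F G H = H G F."""
    size = len(H)
    residuals = []
    for i in range(size):
        for j in range(i, size):
            S = [[0] * size for _ in range(size)]
            S[i][j] = S[j][i] = 1                    # basis of symmetric matrices
            left = matmul(matmul(S, G), H)
            right = matmul(matmul(H, G), S)
            residuals.append([l - r for lr, rr in zip(left, right) for l, r in zip(lr, rr)])
    # Nullity of the linear map S -> S G H - H G S on symmetric matrices.
    return len(residuals) - rank(residuals)


def type_a_block(r):
    """H = [[0, U], [U', 0]] with U the r-by-r auxiliary unit matrix (p = 0, as read here)."""
    H = [[0] * (2 * r) for _ in range(2 * r)]
    for i in range(r - 1):
        H[i][r + i + 1] = 1
        H[r + i + 1][i] = 1
    return H


print(symmetric_solution_count(type_a_block(1), symplectic_unit(1)))   # r = 1
print(symmetric_solution_count(type_a_block(3), symplectic_unit(3)))   # r = 3, m = 1
diag = [[1, 0, 0, 0], [0, 2, 0, 0], [0, 0, 3, 0], [0, 0, 0, 4]]
print(symmetric_solution_count(diag, symplectic_unit(2)))              # non-derogatory
```

For $r = 1$ and $r = 3$ the counts agree with $4m + 3$, and for the non-derogatory diagonal example the count is $n = 2$, one half of the $2n = 4$ matrices commutative with $HG^{-1}$.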

It is obvious that two matrices $FG^{-1}$ for which $F_{ij} = 0$, $i \neq j$, are always commutative and it is known that a maximal set of matrices $FG^{-1}$, commutative in pairs, consists solely of matrices ¹⁷ for which $F_{ij} = 0$, $i \neq j$. The number of linearly independent symmetric matrices $F$ in such a set is, therefore, $d = \sum_{i=1}^{k} d_i$, where $d_i$ is the number of linearly independent matrices $F_{ii}$. Since we have just shown that $d_i = \tfrac{1}{2}n_i$, where $n_i$ is the order of $F_{ii}$, $d = \sum_{i=1}^{k} \tfrac{1}{2}n_i = \tfrac{1}{2}\cdot 2n = n$. We have accordingly proved


THEOREM 4. Every linear conservative dynamical system with $n$ degrees of freedom has at least $n$ linearly independent quadratic integrals in involution.


3. While the results obtained up to the present all deal with linear independence, we now determine, from the known linearly independent quadratic integrals or forms, all the functionally independent quadratic integrals. We first show that quadratic integrals in involution, which are linearly independent, are necessarily functionally independent.

We note that, if

$[F_{11}, F_{22}, \ldots, F_{kk}]$

is a diagonal block symmetric matrix, the quadratic forms corresponding to the matrices

$[F_{11}, 0, \ldots, 0]$, $[0, F_{22}, 0, \ldots, 0]$, $\ldots$, $[0, \ldots, 0, F_{kk}]$

are not only linearly, but also functionally independent, since each involves

17 J. H. M. Wedderburn, op. cit., page 106. 


a different set of variables. Accordingly, we need only consider four simple cases, those corresponding to the types (a), (a₁), (b) and (b₁) of section 1.


Type (a). Let $H - xG$ have the single pair of real elementary divisors $(x \pm p)^n$. Then we may take $H$ in the normal form given by (10),

$H = \begin{pmatrix} 0 & L \\ L' & 0 \end{pmatrix}$,

where $L$ is the matrix $L_i$ of (11). Then the $n$ linearly independent matrices $F$ of Theorem 4 may be chosen to be ¹⁸


0 U* 
26 
(26) 
and those of Theorem 1 as 
QO 
(27) 0 = 0, 1, 2, »n—1) 


The quadratic forms corresponding to the symmetric matrices (26) are 


The $n$ quadratic forms (28) obviously are functionally independent, since each of the forms $f_1, f_2, \ldots, f_n$ contains a variable, which does not occur in any of its predecessors. Since the matrices (26) are all linear combinations of the matrices (27), the $n$ quadratic forms corresponding to the matrices (27) are also functionally independent. It should be noted that the above results are true, even if $p = 0$; in this case, however, there do exist in addition other symmetric matrices $F$, which satisfy (8) but are not of the above form.
Type (a₁). Let $H - xG$ have only the four elementary divisors $(x \pm a \pm ib)^n$, $b \neq 0$, so that $H$ is now of order $4n$. We may take $H$ and $G$ in the normal form described in section 1. Then the $2n$ linearly independent matrices $F$ of Theorem 1 are given by (27), with $k = 0, 1, 2, \ldots, 2n - 1$, while the matrices of Theorem 4 are the matrices (26) together with the matrices


0 ive 01 


For convenience we relabel the $4n$ variables in the order $x_1, \xi_1, x_2, \xi_2, \ldots, x_{2n}, \xi_{2n}$. Then the $2n$ linearly independent quadratic forms corresponding to the matrices (26) and (29) are respectively


18 J. H. M. Wedderburn, op. cit., page 104. 




j 
(30) 2fj = 2 2 + (j= 1,2,---,m), 
p= 
and 
j 
(31) = 2 2 (LpE2n-jsp — Ep2n-jp)- 
p= 


On arranging these $2n$ quadratic forms in the order,

$f_1, g_1, f_2, g_2, \ldots, f_n, g_n$,

we see that they are functionally independent; for $f_1$ and $g_1$ are functionally independent and each pair $f_k$, $g_k$ contains two variables which do not occur in any of the preceding pairs.


Type (b). Let the pencil $H - xG$ have the single pair of elementary divisors $(x \pm ib)^n$, $b \neq 0$. Then we may take $H$ and $G$ in the normal forms given by (12), (13) and (14), (15) respectively. On dropping the suffix $r$, we can easily show that any matrix $F$ satisfying (8) is of the form $D_{11}X$, where $D_{11}$ is given by (18), and the $f_k$ are two-rowed matrices of the form $\begin{pmatrix} c & d \\ -d & c \end{pmatrix}$. If $F$ is to be symmetric, $d = 0$ when $n - k$ is odd, while $c = 0$ when $n - k$ is even. If the $2n$ variables $x$ are relabelled in the order, $x_1, \xi_1, x_2, \xi_2, \ldots, x_n, \xi_n$, the $n$ linearly independent quadratic forms of Theorem 4 may be taken as those $n$ of the $2n$ quadratic forms,


j 
(32) fi=d (— 1) (tpt + j= 1, 2,° 
p= 
and 
j . 
(33) (— 1)?" (ap§j41-p — EpVjs1-p), (j = 1, 
p=1 


which do not vanish identically. In (32), $f_j$ is zero, if $j$ is even; similarly, in (33), $g_j$ is zero, if $j$ is odd. If we write the forms in the order $f_1, g_2, f_3, g_4, \ldots$ it follows, as in the previous cases, that those $n$ of the forms (32) and (33), which are not identically zero, are functionally independent.


Type (b₁). Let $H - xG$ have the single elementary divisor $x^{2n}$. Then, with the notation of § 1 type (b₁) we may take $G = X$ and $H = U$. The
matrices of the n linearly independent quadratic forms are then U*‘X", 
t= 0,1,2,---,n-—1, by Theorem 1. The corresponding quadratic forms 
are proportional to 


j 
(34) fp =D (— 1) (j =1,3,5,- -,2n—1). 


p=1 




As in previous cases, the n quadratic forms (34) are functionally independent. 


Combining the separate results of this section we have 


THEOREM 5. Every linear conservative dynamical system with $n$ degrees of freedom has at least $n$ functionally independent quadratic integrals in involution. If $HG^{-1}$ is not derogatory, the quadratic forms, whose matrices are the matrices $F_i$ of (21), $i = 1, 2, \ldots, n$, are $n$ functionally independent quadratic integrals in involution of the linear system (6).
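Functional independence of quadratic forms can be tested through the rank of their Jacobian at one point, since the gradient of $\tfrac{1}{2}x'Fx$ is $Fx$ and full rank at a single point suffices. The sketch below is an illustration under the same assumptions as before ($G = \begin{pmatrix} 0 & E \\ -E & 0 \end{pmatrix}$, example matrices and helper names not taken from the paper); it contrasts a non-derogatory case with the derogatory case $H = E$.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def symplectic_unit(n):
    """G = [[0, E], [-E, 0]] with E the n-by-n unit matrix (sign convention assumed)."""
    size = 2 * n
    G = [[0] * size for _ in range(size)]
    for i in range(n):
        G[i][n + i] = 1
        G[n + i][i] = -1
    return G


def rank(rows):
    """Rank of a rational matrix by Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in rows]
    found = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            if rows[r][col] != 0:
                factor = rows[r][col] / rows[found][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found


def gradient_rank(forms, x):
    """Rank of the Jacobian of the forms x'Fx/2 at the point x (each gradient is F x)."""
    gradients = [[sum(a * b for a, b in zip(row, x)) for row in F] for F in forms]
    return rank(gradients)


def first_two_integrals(H, G):
    """F_1 = H and F_2 = (H G)^2 H, the first two matrices of (21) in footnote form."""
    HG = matmul(H, G)
    return [H, matmul(matmul(HG, HG), H)]


G = symplectic_unit(2)
x = [1, 1, 1, 1]                                                      # test point
distinct = [[1, 0, 0, 0], [0, 2, 0, 0], [0, 0, 3, 0], [0, 0, 0, 4]]   # non-derogatory
isotropic = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]  # derogatory
print(gradient_rank(first_two_integrals(distinct, G), x))
print(gradient_rank(first_two_integrals(isotropic, G), x))
```

In the derogatory example $F_2 = -H$, so the matrices of (21) no longer supply $n$ independent integrals there, which is why the theorem restricts its second statement to the non-derogatory case.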


4. In determining the maximal number of functionally independent quadratic integrals (not necessarily in involution), we shall once again start from the known set of linearly independent ones. As in the previous section,¹⁹ it will only be necessary to consider four special cases:

Type (a). Let $H - xG$ have the elementary divisors


(x— + 
$p$ real and different from zero. Then we may take $H$ and $G$ in the normal form of § 1, where $A_i$ and $B_i$ are given by (10) and (11) for all values of $i$. Therefore, if $F$ is a symmetric matrix which satisfies (8),


r= 0 ). where W = (Wj;), (4,9 == 1,2,- -,&) 
and 
(35) Wij -( Cj = (0, Gij;), ey =: Cj. 
The matrix G;; is of the form 7° 
(36) =D GiisV®, 


where e is the minimum of e; and e;, and U is the auxiliary unit matrix of 
order e. If all gaye are zero except a particular one, gijs, Which has the value 
unity, we denote the corresponding matrix F by Fije-s and the corresponding 
quadratic form by 2fije-s. With this notation the linearly independent quad- 


ratic forms are 


t 

(37) fist = (i, 7 = 1, 2,° 
p=1 

where r =e, +e, The quadratic 


¹⁹ If $AB = BA$ and $A$ is the diagonal block matrix $[A_1, A_2]$ where the minimal equations of $A_1$ and $A_2$ are relatively prime, $B$ is also a diagonal block matrix partitioned similarly to $A$.

²⁰ J. H. M. Wedderburn, op. cit., page 104.




forms (37) are actually of the same nature as those of (28), except that the 
sets of variables involved are no longer 2, and * *, In 
particular, if = y; and = 2), 


(38) fis 1 = = 


In order to determine the functionally independent quadratic integrals, we make use of linear differential operators reminiscent of the Aronhold operators of classical invariant theory.²¹ We define the linear differential operators
Q; and 0; by 


Ci 0 ei-1 0 
39 = Ursp+1 (es — ’ 
(39) i =P + ( 
e-1, and 


0 ey-1 0 
p=1 p=9 Ln+p+p+1 


e.,. Then 


t 
p=1 
and 


t 
= (¢ + 1) 
p=1 
since g=p-+e;. Therefore 


t 
(Q; + 95) = (p—1) + = (t — p + 1) 
p=2 p=1 


t+1 
== 


and finally by (37) ” 


(40) + 05) fije = 
We note that the 2n —e, quadratic forms 
(41) fiit,s (tj = 


are functionally independent; for, if we arrange them in the order 


each form contains a variable which does not occur in any of its predecessors. 
These variables are in order 


21 Cf. L. E. Dickson, Modern Algebraic Theories, pp. 25-27. 




Next we show that every quadratic form (37) is a function of the quad- 
ratic forms (41). It is a consequence of (38) that 2;/2j.. = fiji/fjju1 and 
therefore that 
(42) = 


where qi; is a rational function of the quadratic forms fs; and fss411 Where 
(43) either $j \leq s \leq i - 1$ or $i \leq s \leq j - 1$.


Consequently we have 
(44) fii = Qisfiir 
and are in a position to prove,

LEMMA 2. The quadratic form $f_{ijt}$ is a rational function of the quadratic forms $f_{ssr}$, $f_{s,s+1,r}$, where $r \leq t$, and $s$ satisfies one of the inequalities (43).


We shall prove this lemma by induction on $t$ and observe that, as a consequence of (44), it is true when $t = 1$. We assume the lemma true for the value $t$ and therefore have
(45) R= R(feors 1 =t, s defined by (43). 


Let 2 = SQ, where Q, is defined in a similar manner to Q; in (39) and 
the summation extends from i to 7 or from j to 1. Since, by definition, Q is a 
linear operator, QF is a sum of terms, each term being a product of the partial 
derivative of R with respect to one of its variables fepr by Ofspr. By (40) 


Ofije = and Ofspr = 


and therefore by operating with 2 on both sides of equation (45) we have 
tfijts: = W, where W is a rational function of fer and fessir; $ is defined by 
(43) and r=¢-+ 1. Hence our lemma is proved and consequently all the 
quadratic forms (37) are functions of the $2n - e_1$ functionally independent quadratic forms (41). It should be noted, for later reference, that $e_1$ is the highest exponent of $(x \pm a)$ in the elementary divisors of the pencil $H - xG$ or of $HG^{-1} - xE$, and that accordingly the minimal equation of $HG^{-1}$ is of degree $2e_1$.
Type (a,). Let H — 2G have the elementary divisors 


Then, if we let $G$ be of order $4n$ and relabel the variables $x_1, \xi_1, x_2, \xi_2, \ldots, x_{2n}, \xi_{2n}$,
we find, by an argument similar to that applied to the forms (30) and (31) 
of the previous section, that the forms 




896 JOHN WILLIAMSON, 


t 
(46) = (LrspTni0-tep + 
p=1 
and 
t 
p=1 


are linearly independent if $t$, $i$ and $j$ have the values given in (37). In
particular we may write
(48) Wijr = + and = — 
The $4n - 2e_1$ quadratic forms (46) and (47), for which $j = i$ or $i + 1$, are
functionally independent, for they can be so arranged in pairs that each pair
contains two variables which do not occur in any of the preceding pairs.

On forming the Jacobian of the eight forms 


(49) Wikiy — Vikiy Wijiy — Viji, Wikiy — Vikiy Wij1y — Vijr, 
we obtain, with the notation of (48), the eight-rowed square matrix 


(7,0 Y; 0 
4; 0 


Zk bi Yi Hi 
0 4, Y; 0 fe 
0 Zj 0 Y; 


the rank of K is the same as that of the matrix P obtained from K by replacing 
Y; by W; and Y; by Wj. Since the matrices W; and Z; are commutative. 
a simple calculation shows that the product of the two by four matrix 


(—Y¥jZ;, — Viti) 


by $P$ is the zero matrix. Hence, at most six of the eight forms (49) are
functionally independent and, since any three of the pairs $w_{ab1}$, $v_{ab1}$ in (49)
are functionally independent, each member of the fourth pair is a function
of the other six forms. In particular


(50) Wiki = F(Wijr, Vijrs Wir, Vier, Wij, Viir)> 
and, if k 
(51) = G(Wij,15 Vijis Wijr, 


with similar results for the corresponding forms vx, and Vj41. 
As a consequence of (51), if $a > b$, $w_{ab1}$ is a function of forms $w_{ij1}$ and
$v_{ij1}$, where $i < j$. As a consequence of (50), when $b - a \ge 2$, $w_{ab1}$ is a
function of forms $w_{ij1}$ and $v_{ij1}$, where $j - i < b - a$. The same is true of
the forms $v_{ab1}$ and therefore we have
(52) = F (Wys1, Vss1 We,841,15 Us,s41,1 vin = G (Wss1; Vss15 Ws,s41,19 Us,s+1,1 


Let O; and Oj; be the differential operators obtained from Q; and Qj, 
defined in (39), by replacing $x$ by $\xi$. Then


(53) (2; + + + wijt = 
and 
(54) (Q; + O; + QO; + vigt = 


If $O = \sum O_p$, where the summation extends from $i$ to $j$ or $j$ to $i$, we may
prove a lemma, which is the analogue of Lemma 2. In the latter the operator
$\Omega$ is replaced by $\Omega + O$, (44) by (52), and (40) by (53) or by (54). It
then follows that any of the quadratic forms (46) or (47) is a function of the
$4n - 2e_1$ quadratic forms for which $j = i$ or $i + 1$. Therefore, in this particular
case, where $G$ is of order $4n$ and the minimal equation of $HG^{-1}$ is of
degree $4e_1$, the number of functionally independent quadratic integrals of the
system (6) is $4n - 2e_1$.

Type (b,). Let the pencil $H - xG$ have the elementary divisors


Then we may take $H$ and $G$ in the normal form of § 1, where $A_i$ and $B_i$ are
given by equations (12), (13), and (14). If $F$ is a symmetric matrix, which
satisfies (8), and, if $F = (F_{ij})$, $i, j = 1, 2, \ldots$, is a partition of $F$ similar
to that of $H$ or $G$,

Bij = (Wis) Xj, 


where Wj; is defined by (35) and (36), with the addition that 


-( 
—lijs 


Since $F$ is symmetric, $F'_{ij} = F_{ji}$, so that we need only concern ourselves with


Jiis 


the case in which $i \le j$. If $F_{ab} = 0$ for all $a$ and $b$ except when $a = i$, $b = j$
and $a = j$, $b = i$, the corresponding quadratic form involves two sets of
variables, one containing $2e_i$ and the other $2e_j$ variables. If for convenience
of notation we write $e_i = e$ and $e_j = d$, we can denote these sets by

respectively. The corresponding linearly independent quadratic forms are then 


t 
(55) Wijt = D (—1)?** + ({=1,2,---,d), 




t 
(56) =D (— 1)?*? — EpYts1-p) ({=—1, * 
p=1 


If $i = j$, the variables $x$, $\xi$ are the same as $y$, $\eta$; and $w_{ijt} = 0$, if $t$ is even,
while $v_{ijt} = 0$, when $t$ is odd. Let


Then 
t 
= 2 (—- 1)?**9( ts1-p Lps1Nt+1-p) 
p= 
t+1 
p= 
while 
t 
2 1)?**(¢ + i—_ (Xpyts2-p EpYts2-p)+ 
Therefore 


t+1 

(57) (Q, + = td (— 1)?** — EpYt+2-p) = Wijtu. 
p=1 

Similarly 


t 
= (-— 1)?**9 (Epsintsi-p + LpsYts1-p) 
p=1 


and 
t 
= (— 1)?*(¢ + 1 — p) (— 2pYt+2-p — Epntse-p)» 
p= 
so that 
(58) + Oy) = — 


The Jacobian of the four forms $w_{ii1}$, $w_{jj1}$, $w_{ij1}$ and $v_{ij1}$ is the matrix


0 0 
0 0 Yi 
1 


The determinant of this matrix is zero while the matrix of its first three rows
has rank three. Therefore $v_{ij1}$ is a function of $w_{ii1}$, $w_{jj1}$ and $w_{ij1}$. Since
$\Omega_x + \Omega_y$ is a linear operator, we may employ an argument similar to that
used in the proof of Lemma 2, and from (57) and (58) deduce that wijz is a 
function of Wii1, Wij1, Vijoe By repeating this argument we 
finally have the result that, for any value of $k$, $w_{ij,2k}$ and $v_{ij,2k-1}$ are both functions
of forms of the type $f_{abc}$, where


(59) $f_{abc} = w_{abc}$, if $c$ is odd, and $f_{abc} = v_{abc}$, if $c$ is even.


and 




If we denote the variables associated with $F_{rr}$ by $z$ and $\zeta$, the Jacobian
matrix of the six forms 


(60) fits, fori, firs, 
formed with respect to the variables 7, 41, 21, €1, m1, €1, is the matrix 


fe 0 60 & 0 0) 
0 0 0 0) 
00200 
a= Yi 0 0 
(210476 0 &) 


If $X$ is the column vector with components $\xi_1, \eta_1, \zeta_1, -x_1, -y_1, -z_1$, the
vector $QX$ is the zero vector. Hence $Q$ is singular. The first five of the forms
(60) are functionally independent and therefore $f_{ir1}$ is a function of these
five. Therefore, if $i + 1 < r$,
(61) $f_{ir1} = F(f_{ab1})$,
where $b - a < r - i$.

As a consequence of (61) we have

(62) $f_{ir1} = G(f_{ab1})$,

where $b = a$ or $a + 1$. By operating on both sides of (62) with an operator,
which is the sum of all operators $\Omega_x$ and $\Omega_y$ for all sets of variables, it follows
that, for all values of $i$ and $j$, the form $f_{ijt} = F(f_{abs})$, where $b = a$ or $a + 1$,
$s = 1, 2, \ldots, e$. Hence, the quadratic forms $w_{ijt}$ and $v_{ijt}$ of (55) and (56) are
functions of the forms
(63) fiit, (ime fjaiori+1, 


where $f_{ijt}$ is defined by (59). There are $2n - e_1$ forms (63) and they are
functionally independent, as they may be so arranged that each involves a
variable which does not appear in any of its predecessors. Once again, $e_1$ is
one half the degree of the minimal equation of $HG^{-1}$.

If $H$ is non-singular, we can take $H$ and $G$ in diagonal block form
$H = [H_1, \ldots, H_t]$, $G = [G_1, \ldots, G_t]$, where $H_i$ and $G_i$ are of
order $2n_i$ and the elementary divisors of $H_i - xG_i$ are of one of the three
types considered above. The number of functionally independent quadratic
forms $F$ is therefore $\sum_{i=1}^{t} (2n_i - 2m_i/2)$, where $2m_i$ is the degree of the
minimal equation of $H_iG_i^{-1}$. But $\sum_{i=1}^{t} n_i = n$, and $\sum_{i=1}^{t} 2m_i = 2m$ is the degree
of the minimal equation of $HG^{-1}$. Since the remaining case, in which $H$ is
singular, is rather complicated, it is advisable to sum up our results in the
form of


THEOREM 6. Let $H$ be the matrix of the Hamiltonian function of a linear
conservative dynamical system with $n$ degrees of freedom. If the degree of
the minimal equation of $HG^{-1}$ is $2m$, the system has $2n - m$ independent
quadratic integrals unless $H$ is singular.


5. In order to obtain corresponding results for the case of a singular $H$,
it is only necessary to consider the case in which every elementary divisor of
$H - xG$ is of the form $x^r$. We therefore consider the pencil $H - xG$, in which
the elementary divisors are


When $r$ is odd, the elementary divisor $x^r$ must occur an even number of times;
hence, if $e_i$ is odd, either $e_i$ has the same value as $e_{i-1}$ or the same value as
$e_{i+1}$. From § 1 we see that a normal form for $HG^{-1}$ is the diagonal block
matrix


where $W_i = U_i$, if $e_i$ is even, while, if $e_i$ is odd, either $W_i = U_i$ and
$W_{i+1} = -U'_i$, or $W_{i-1} = U_i$ and $W_i = -U'_i$. The normal form for $G$ is
also a diagonal block matrix with a block $Y_i$ (defined by (15)) corresponding


0 
to each even e; and a block ( : 0 corresponding to each pair of odd ¢;. 
4i 
In the above the matrices, 4), U; and X; are the matrices of order e; de- 
fined in § 1. 

In order to determine the form of the most general symmetric matrix $F$,
which satisfies (8), we let

(64) $C = FG^{-1}$.

Since $C$ is commutative with $HG^{-1}$, if $C = (C_{ij})$, $i, j = 1, 2, \ldots, k$, we have

(65) $W_i C_{ij} = C_{ij} W_j$.

The matrices $C_{ij}$ are of four types depending on the structure of $W_i$ and $W_j$.
However, as we shall only be interested in symmetric matrices $F$, we need only
consider the different possibilities for $C_{ij}$ when $i \le j$, so that $e_i \ge e_j$. Therefore,
in what follows, $i$ is always less than or equal to $j$. These possibilities are:


Type Wi;=Ui, Wj; Then = (4), where Gi; is 


polynomial in U;. 




Type (ii). Wi=—U’i;, Wj; =—U’;. Then where Kj; 
ij 
is a polynomial in V/’;. 
Type (iii), Wj; =—U’;. Since U’;X; =—AjUj, (65) 
becomes 


and 


Gi; \. 
where ( “- is of type (i). 
Type (iv W; W; === U;. Since U ;X; = — X;U’;, (65) 


becomes 
U = ( js 


X;, where s of 
ij Y;, where is of type (1) 


Symbolically we may denote a matrix Ci; of type (p) by Ty, p= 1, 2, 3, 4 and 


so that 


therefore symbolically have the result 
(66) T,;=T,X and 


If $F = (F_{ij})$ is a partition of the matrix $F$, defined by (64), similar to
that of $C$, the matrix $F_{ij}$ is one of four distinct types.

If W,; =U; and is even, Fi; = Since is of type (i), by 
(66), Fix is of type (iii). 


If W; = U; and ¢; is odd, Fi; = — Ci. Since e; is odd, Wi,, = — VU’; 
and (;,;,, and, therefore, Fi; is of type (ili). 
lf Wi Since = Ui, the matrix Cj;_, and, 


therefore I’;;, is of type (iv). 
We accordingly have the lemma, 


LEMMA 3. The matrix $F_{ii}$ is either of type (iii) or of type (iv). It is
of type (iv) if, and only if, $W_i = -U'_i$. If $F_{ii}$ is of type (iv), the matrices
$F_{i-1,i-1}$ and $F_{i+1,i+1}$ are both of type (iii).


ej even: Then If Wi = Ui, Ci; is of (1) and, by (66), 
Fi; is of type (iii). If Wi =—U%i, Ci; is of type (iv) and, by (66), Fi; is 
of type (ii). On replacing the matrices of (’ PF ‘) by their corresponding 
ij 


types, we may express the above results conveniently by the two diagrams 




Ts T, T2 
(67) ( and ( 

ej odd, W; = Then Fi; — Ci jas. If Wi = Ui, Fi; is of type (iii) 
and, if W; =— Ui, Fi; is of type (ii). These two results lead to the same 
diagrams (67). 

e; odd, W; =— U’;: Then Fi; = Cij... If Wi = Ui, Fi; is of type (i) 
and, if W; = — U’;, Fi; is of type (iv). These results may be expressed by 
means of the two new diagrams 

(68) ( and ( 


It is apparent from the diagrams (67) and (68) that the type of $F_{ij}$
uniquely determines the types of $F_{ii}$ and $F_{jj}$, and conversely. Since $T_3$ and
$T_4$ involve the matrix $X$, while $T_1$ and $T_2$ do not, we shall call the matrices of
types (i) and (ii) positive, and those of types (iii) and (iv) negative. For
brevity we shall say that $F_{ij}$ has sign $\epsilon$, where $\epsilon = +1$ or $-1$, according as
$F_{ij}$ is of positive or negative type.

$i + 1$ less than $j$: If $F_{ij}$ is positive, either $F_{ii}$ is of type (iv) and $F_{jj}$ of
type (iii) or $F_{jj}$ is of type (iv) and $F_{ii}$ of type (iii). In the first of these
cases, as a consequence of Lemma 3, $F_{i+1,i+1}$ is of type (iii), so that $F_{i,i+1}$ is of
type (ii) and $F_{i+1,j}$ is of type (iii). In the second case, $F_{j-1,j-1}$ is of type (iii),
$F_{i,j-1}$ of type (iii) and $F_{j-1,j}$ of type (i). We have therefore proved

LEMMA 4. If $i + 1 < j$ and if $F_{ij}$ is positive, there exists an integer $k$,
$i < k < j$, such that $F_{ik}$ and $F_{kj}$ are of opposite sign.

On the other hand, if $F_{ij}$ is negative, $F_{ii}$ and $F_{jj}$ are of the same type.
Therefore, if $i < k < j$, $F_{ik}$ and $F_{kj}$ are both of the same sign. We may
combine this last result with that of Lemma 4 to have

LEMMA 5. Let $F_{ij}$ have the sign $\epsilon = \pm 1$ and let $i + 1 < j$. Then there
exists an integer $k$, $i < k < j$, such that, if the sign of $F_{ik}$ is $\delta$, the sign of
$F_{kj}$ is $-\epsilon\delta$.

We next obtain explicit formulae for the linearly independent quadratic
forms. If $F_{ii}$ is of type (iii), $F_{ii} = G_{ii}X_i$, where $G_{ii}$ is a polynomial in $U_i$.
Since $F$, and therefore $F_{ii}$, is symmetric, $G_{ii}$ is an even or an odd polynomial
in $U_i$, according as $e_i$ is even or odd. Let

Fiat = 
Then, if e; =e, and is the vector with components 2, , Ze, 


itt = Giit, 




where 
t 

(69) git = (— 1) aptts1-p, (¢=—1, *,@). 
p= 


It is obvious from (69) that $g_{iit}$ is zero, if $t$ is even, so that there are only
$[\tfrac{1}{2}(e + 1)]$ linearly independent quadratic forms $g_{iit}$, for a fixed value of $i$.
Similarly, if Fi; is of type (4), Fiie = (Uie*)’X; and a’ = hiit, where 


t 
(70) hit= = (— 


p=1 
On comparing (69) and (70) we see that $h_{iit}$ is of the same form as $g_{iit}$, if
$x_j$ is replaced by $x_{e+1-j}$.
Since $F$ is symmetric, the linearly independent quadratic forms corresponding
to the matrix $F_{ij}$ are those whose matrices are of the form


. If ej; =e and e;=d and and y are vectors of dimensions 
ij 


e and d respectively, the corresponding quadratic form is 


0 Fi; x 
71) ( ) = 2a’ 


Since, according to assumption, $e \ge d$, if $F_{ij}$ is of type (iii), the linearly
independent quadratic forms obtained from (71) are
(72) =X (— 1)" 
p=1 
If Fi; is of type (iv) they are 
t 
(73) higt =X (— 1)? (t= 1,2,---,d). 
p=1 


If ’;; is of type (i) they are 


t 
(74) Uist = L CpYasp-t, (¢=1,2,---,d), 
p=1 
and, if Fi; is of type (ii), they are 
t 
(75) = Lesp-tYp; ({=—1, 


p=1 
Although at first sight it appears that there are four distinct types, there are
really only two. If, in (73) and (74), we replace $y_j$ by $y_{d+1-j}$ and in (73) and
(75), $x_j$ by $x_{e+1-j}$, the form $h_{ijt}$ becomes the same as $g_{ijt}$ while the forms
(74) and (75) both become

t 
(76) Vijt = (¢=1, 2,°°°,@). 


p=1 




The $2n$ original variables may be placed in sets of $e_1, e_2, \ldots, e_k$ elements
corresponding respectively to the symmetric matrices $F_{ii}$. If,
when $F_{ii}$ is of type (iv), the $e_i$ variables associated with $F_{ii}$ are relabelled
in the reverse order, this relabelling is equivalent to the replacement of $y_j$ by
$y_{d+1-j}$ and of $x_j$ by $x_{e+1-j}$, the replacement by which (76) was obtained. All
such replacements may be made simultaneously and, if this is done, the linearly
independent quadratic forms corresponding to the matrices $F_{ij}$, $i \le j$,
$i, j = 1, 2, \ldots, k$, are all of two types: the forms $g_{ijt}$ of (72) and the forms $v_{ijt}$
of (76). It is of course apparent that, if $i = j$, the formula (72) reduces
to (69). In conformity with our previous convention, the forms $g_{ijt}$ are
called negative, and the $v_{ijt}$ positive. For convenience we shall denote these
forms by $f_{ijt}$, so that

(77) either $f_{ijt} = v_{ijt}$ or $f_{ijt} = g_{ijt}$.


Further, when there is no risk of confusion, we shall drop the suffixes $i$ and $j$
and write $v_t$, $g_t$ and $f_t$ for $v_{ijt}$, $g_{ijt}$ and $f_{ijt}$ respectively.
We now define two other linear differential operators, $X$ and $H$, by


a 
i=1 i= 
Then 
t t 
(X + cH) ve = + 
p=1 p=1 
t t 
p=1 
t t-1 
» (— € (— 1) — P) Yts1-p 
p=1 
t 
> (—- 1) ts1-p 


=t > (—1) a = tess 
p=1 
and therefore 
(79) (xX + (—1)*H) ve = tts. 
Further, 


t t 
(X + eH) ge =X (—1) + €(— 1) (— 1) 
p= p=1 


1 


t t 
=> (— 1) + €(—1)** (— 1) py pr 
p=1 p=1 
t t 
=. 
Lif (—1), 


t 
= — = — Wes 
p=0 




and therefore 


(80) (X + (—1)*"H) ge = — toes. 
Since =v; =— gi, 
(X—H) = ge, 
(X — H)*(t1y1) = (X— H) = — 2vs, 
(X — H)* = — 2(X —H)v, = — 3! gu, 


and in general 

(81) (X — = (—1)*(28 + 1) ! geese, 
(X — H)?8(x1y1) = (—1)# (2s) ! 

Similarly 

(82) (X + = (—1)*(2s + 1) ! 
(X + H)*(ays) = (— gore 


It was remarked earlier that the forms $f_{ijt}$ of (77) are of two types,
positive and negative. However, for a fixed $i$ and $j$ and variable $t$ all $f_{ijt}$ are
of the same type, the type being determined by the sign of $F_{ij}$.

Since $v_t$ is positive and $g_t$ is negative, we may write (81) and (82) more
compactly in the form

(83) (X — cH) = afesse, (X + = 


where $-\epsilon$ is the sign of $f_t$ and $a$ and $b$ are numerical constants. If $i = j$,
$f_{iit}$ is the $g_{iit}$, which is defined by (69); and, on dropping the suffix $i$, we
have, as the analogue of (83),


where a is a numerical constant. 

We shall now prove that all the quadratic forms $f_{ijt}$ are functions of those
for which $j = i$ or $i + 1$. In so doing we shall say that $f_{ijt}$ is reducible,
if it is a function of quadratic forms $f_{abc}$, where either $b - a < j - i$ or
$b = j$, $a = i$ and $c < t$. Clearly, $f_{ij1} = \pm x_1y_1$ is reducible, since it is a
function of $x_1^2 = f_{ii1}$ and of $y_1^2 = f_{jj1}$. The reducibility of $f_{ijt}$ will be
expressed by writing

$f_{ijt} \equiv 0$.

Thus, if $f \equiv h$, $f - h$ is reducible.

In showing that $f_{ijt}$ is reducible, when $j > i + 1$, we shall use an induction
proof. The proof consists of two essentially different parts, since the
cases of even and odd values of $t$ have to be treated separately. In fact, for
odd $t$ the restriction $j > i + 1$ may be replaced by the weaker inequality $j > i$.



We first assume that $f_{ijt}$, $j > i$, is reducible for $t \le 2m$ and under this
assumption prove that then $f_{ij,2m+1}$ is reducible. Since $(x_1y_1)^2 = x_1^2y_1^2$,
(X +- = (X + 


Therefore by Leibnitz’ Theorem, 


2m 2 : 
(85) ( m) {(X eH) (X + 
r=0 
2m 2m 2\) 2m-r 2 
=2 (X" (2°) }H (y:"). 


If $r$ is even, $X^r(x_1^2)$ and $H^{2m-r}(y_1^2)$ are both reducible by (84). Moreover,
if $F_{ij}$ has the sign $-\epsilon$, and $r$ is even but different from 0 or $2m$,
$(X + \epsilon H)^r(x_1y_1)$ and $(X + \epsilon H)^{2m-r}(x_1y_1)$ are also reducible by (83) and our
induction assumption. Accordingly, with this value of $\epsilon$, (85) reduces to


(86) + = V—U, 

where 

and 


r odd 
On writing 
X*(z7,) X* and H*(y,) =H’, 


we have in place of (87) 


r odd a=0 
or 
r 2m-r heath 


rodd a=0 


If $U^*$ is obtained from $U$ in (87) by replacing $\epsilon$ by $-\epsilon$, since $r$ is odd, as a
consequence of (83) and our induction assumption, $U^*$ is reducible. Hence,


(90) 


In (88) $U$ is expressed in terms of powers of $\epsilon$. In the difference $U - U^*$
all even powers will disappear and each odd power will occur with a factor
two. Therefore, $U - U^*$ is equal to twice the sum of those terms on the
right of (89) for which $a + b$ is odd. In this summation therefore each
term has the factor $\epsilon$. For fixed values of $a$ and $b$, the coefficients of
2(2m) !eX*X"/alb! is —a)!(2m—r—b)!, where the 




summation extends over all odd values of $r$, for which $r \ge a$, $2m - r \ge b$.
If $a$ is odd, $r - a$ is even and, since $a + b$ is also odd, $2m - r - b$ is odd;
while, if $a$ is even, $r - a$ is odd and $2m - r - b$ is even. Hence in $\Sigma$ each
term H?’H%/p!q!, for which p+ q = 2m—a—b, occurs exactly once, and 


therefore 
— a)! (2m —r—b)! 
2m —a—l 
> ( ') H?H4/(2m —a—b)! 


p+q=atb 
= (y,y,)/(2m — a — b) ! = /(2m —a— b)!. 
Therefore, on changing the order of summation in the sum for U — U*, 
we have 


U —U* = 2(2m)! (y,7)/(2m —a— db)! 


a+b odd 


2m 
¢ Xa a7) (472) 7. 
a+b odd (, + ( ) 
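Therefore, by (90), $U \equiv V$; while by (86), $(X + \epsilon H)^{2m}(x_1y_1) \equiv 0$. But,
by (83), $(X + \epsilon H)^{2m}(x_1y_1)$ is a numerical multiple of $f_{ij,2m+1}$, and accordingly $f_{ij,2m+1}$ is reducible.
Incidentally, we have shown that $f_{ij,2m+1}$ is a function of $f_{iis}$, $f_{jjs}$ and $f_{ijs}$,
where $s \le 2m$.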

We now show that, if $i + 1 < j$ and if $f_{ijt}$ is reducible for $t \le 2m + 1$,
then $f_{ij,2m+2}$ is reducible. Let the sign of $F_{ij}$ be $\epsilon$ and let $z$ be the variable
associated with the integer $k$ of Lemma 4. Let $\delta$ be the sign of $F_{ik}$. Then,
by Lemma 5, $-\delta\epsilon$ is the sign of $F_{kj}$. If $Z$ is the differential operator obtained
from $X$ in (78) by replacing $x$ by $z$, as a consequence of (83) and our induction
assumption we have the following results:


(X eH) (2,41) Of ijotse = 0, if m, 
(91) (X—cH)** = Of =O, if t< m, 
(X + == (X — 8Z)*"(2,2,) 
= (cH — == (eH + 8Z)*" = 0. 
Since (4141) 217 = (4121) (4121), 
(X + eH + 8Z)?™*" (x,y, ) 2,7 = (X + + (4121). 
Therefore, by Leibnitz’? Theorem, if g = 2m +-1, 


g 


{(X + €H)"(a1y1) } (8Z)9" (217) 


r=0 




On using the reduction relations (91), we have, as a consequence of this last 
equation, 

+ =V —U, 
where 


rT even 


(7) {X + } (82)9 (2,7) 


and 


(7) {(X + 8Z)" (x21) } (EH + (4:21). 


rT even 


If $V^*$ is obtained from $V$ by replacing $\delta$ by $-\delta$, each term of $V^*$ is
reducible (see (91)). Therefore


On expanding V and V*, we find 


Tr g-r 
— V* = 
where g —a —b is odd, so that a+b is even. On rearranging the order of 
summation, we find 


V—Vt=2 > ee, (z,) 


a+b even a=0 


5 (7) {(X eH) =U. 


Therefore $V - U \equiv 0$ and, accordingly, $f_{ij,2m+2} \equiv 0$.

Since $f_{ij1} \equiv 0$, if $i \ne j$, it now follows that $f_{ijt} \equiv 0$ unless $j = i$ or
$j = i + 1$; and that $f_{i,i+1,t} \equiv 0$ when $t$ is odd. Accordingly, all quadratic
forms $f_{ijt}$ are functions of the forms

(92) $f_{iit}$, $f_{i-1,i,s}$; $t$ odd and $s$ even; $t, s \le e_i$.

If $e_i = 2m_i$ or $2m_i - 1$, the number of forms $f_{iit}$ in (92) is $m_i$. The number
of forms $f_{i-1,i,s}$ in (92) is $m_i$, if $e_i = 2m_i$, and is $m_i - 1$, if $e_i = 2m_i - 1$.
The total number of forms $f_{iit}$, $f_{i-1,i,s}$ in (92), for a fixed $i \ne 1$, is therefore
$e_i$. Accordingly, the total number $l$ of forms (92) is


$l = m_1 + \sum_{q=2}^{k} e_q = m_1 + 2n - e_1$,
where $e_1 = 2m_1$ or $e_1 = 2m_1 - 1$.
Hence

$l = 2n - m_1$, if $e_1 = 2m_1$, and $l = 2n + 1 - m_1$, if $e_1 = 2m_1 - 1$.
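The count above can be checked by direct enumeration. The following is a minimal sketch, assuming the exponents are listed in non-increasing order $e_1 \ge e_2 \ge \ldots \ge e_k$ with sum $2n$; the function names `count_forms` and `closed_form` are hypothetical and do not come from the paper.

```python
def count_forms(exponents):
    """Enumerate the forms (92) for exponents e_1 >= e_2 >= ... >= e_k."""
    total = 0
    for position, e in enumerate(exponents):
        # forms f_{iit} with t odd, t <= e_i
        total += len(range(1, e + 1, 2))
        # forms f_{i-1,i,s} with s even, s <= e_i (only for i >= 2)
        if position > 0:
            total += len(range(2, e + 1, 2))
    return total


def closed_form(exponents):
    """l = 2n - [e_1 / 2], where 2n is the sum of the exponents."""
    return sum(exponents) - exponents[0] // 2


if __name__ == "__main__":
    for exps in ([3, 3, 2], [4, 2], [1, 1]):
        print(exps, count_forms(exps), closed_form(exps))
```

Both functions agree on each example, which is the content of the two cases $e_1 = 2m_1$ and $e_1 = 2m_1 - 1$.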




That these $l$ forms (92) are actually functionally independent is apparent,
if they are written in the order


since each of these forms contains at least one variable which does not appear
in any of its predecessors. The minimal equation of $HG^{-1}$ is of degree $e_1$, and,
therefore, the number $l$ can be expressed in terms of $e_1$ and the order of $HG^{-1}$.
By the argument used to prove Theorem 5 we can now complete the proof
of the theorem,


THEOREM 7. Let $H$ be the matrix of the Hamiltonian function of a
linear conservative dynamical system with $n$ degrees of freedom. If $e$ is the
degree of the minimal equation of $HG^{-1}$, the system has $2n - [\tfrac{e}{2}]$ functionally
independent quadratic integrals.
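The count in Theorems 6 and 7 can be tried numerically. The sketch below is illustrative only: it assumes that $G$ is the standard skew-symmetric matrix with blocks $(0, E; -E, 0)$ (the normal form of § 1 is not reproduced here), reads the count as $2n - [e/2]$, and obtains the degree $e$ of the minimal equation as the rank of the powers $E, A, A^2, \ldots$ of $A = HG^{-1}$. All function names are hypothetical.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rank(rows):
    """Rank by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in r] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            if rows[r][col] != 0:
                factor = rows[r][col] / rows[rk][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk


def minimal_degree(a):
    """Degree of the minimal equation of a: rank of the flattened powers."""
    n = len(a)
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    flattened = []
    for _ in range(n + 1):
        flattened.append([v for row in power for v in row])
        power = mat_mul(power, a)
    return rank(flattened)


def symplectic_inverse(n):
    """Inverse of the assumed G = [[0, E], [-E, 0]], i.e. [[0, -E], [E, 0]]."""
    size = 2 * n
    g_inv = [[0] * size for _ in range(size)]
    for i in range(n):
        g_inv[i][n + i] = -1
        g_inv[n + i][i] = 1
    return g_inv


def quadratic_integral_count(h):
    """2n - [e/2], with e the degree of the minimal equation of H G^{-1}."""
    n = len(h) // 2
    e = minimal_degree(mat_mul(h, symplectic_inverse(n)))
    return 2 * n - e // 2


if __name__ == "__main__":
    isotropic = [[int(i == j) for j in range(4)] for i in range(4)]
    anisotropic = [[1, 0, 0, 0], [0, 4, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    print(quadratic_integral_count(isotropic))
    print(quadratic_integral_count(anisotropic))
```

With variables ordered as two coordinates followed by two momenta, the first matrix is an isotropic oscillator and the second has two distinct frequencies; the singular examples $H = 0$ and $H = \mathrm{diag}(0, 1)$ with $n = 1$ exercise the odd and even cases of $e$.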


6. Linear integrals. If $y'x = \sum_{i=1}^{2n} y_ix_i = l$ is a linear integral of (6),
condition (2) reduces to
$y'GHx = 0$ or $HGy = 0$.

Conversely, if $HGy = 0$, $l$ is a linear integral of (6). Since $G$ is non-singular
we have the result:$^{22}$ a linear integral of the system (6) exists if, and only
if, $H$ is singular.

If $l$ is a linear integral, $l^2$ is a quadratic integral, for


= 2y’gHzr = 0. 


The number of linearly independent linear integrals is $k$, where $2n - k$ is the
rank of $HG$ or $HG^{-1}$ or $H$. We may express this by


THEOREM 7 bis.$^{23}$ The number of linearly independent linear integrals
is the number $k$ of the elementary divisors of the form $x^r$ belonging to the
pencil $H - xG$.
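As a small illustration of the condition $HGy = 0$ and of the count $k = 2n - \operatorname{rank} H$, the sketch below again assumes the standard skew-symmetric $G$ with blocks $(0, E; -E, 0)$; this choice and the helper names are assumptions made for the example, not the paper's normal form.

```python
from fractions import Fraction


def rank(rows):
    """Rank by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in r] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            if rows[r][col] != 0:
                factor = rows[r][col] / rows[rk][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk


def apply_g(y):
    """Multiply y by the assumed G = [[0, E], [-E, 0]]."""
    n = len(y) // 2
    return list(y[n:]) + [-v for v in y[:n]]


def is_linear_integral(h, y):
    """True when H G y = 0, so that l = y'x is a linear integral."""
    gy = apply_g(y)
    return all(sum(row[j] * gy[j] for j in range(len(gy))) == 0 for row in h)


def count_linear_integrals(h):
    """k = 2n - rank(H)."""
    return len(h) - rank(h)


if __name__ == "__main__":
    # Variables (q1, q2, p1, p2); q2 does not occur in the Hamiltonian.
    h = [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    print(count_linear_integrals(h))
    print(is_linear_integral(h, [0, 0, 0, 1]))  # l = p2
    print(is_linear_integral(h, [0, 1, 0, 0]))  # l = q2
```

Here $H$ is singular because one coordinate is absent, and the single linear integral is the conjugate momentum.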


With the notation of the previous section the linearly independent linear 
integrals are 


$\sqrt{f_{ii1}}$, $(i = 1, 2, \ldots, k)$.
There is one and only one integral for each value of $i$. If $e_i$ is even,


V fits = Lon, o= 6, + lot 4-15 


22 This result is not new. See Aurel Wintner, “On the linear conservative
dynamical systems,” Annali di Matematica pura ed applicata, ser. 4, tomo 13 (1934-
35), pp. 105-112.
23 This theorem is proved by Wintner. See reference 22.




if $e_i$ is odd but $W_i = U_i$ (so that $e_{i+1} = e_i$), then

V fii = Zp. and V = U7, r= p+ 
Since G is now in the normal form of §1 and is therefore a diagonal block 
matrix, and since @o,; is the only linear integral corresponding to its block 
in G, Yo, is in involution with all other linear integrals. The same is true 
of Zp,, and z;. Further, the integrals xp,,; and x; are in involution unless 
ej =1. In this last case, e; = 1, they are not in involution. Hence we have 
the theorem 

THEOREM 5 bis. The number of linearly independent integrals in involution
consists of $k - f$ members, where $k$ is the total number of elementary
divisors of the pencil $H - xG$ which have the form $x^r$ and $2f$ is the number
of those for which $r = 1$.

As a consequence of the above theorem we have the corollary, 

COROLLARY 1. All linear integrals of the system (6) are in involution
if, and only if, the pencil $H - xG$ has no linear elementary divisor of the
form $x - 0$.

Further, if there exist $f$ linearly independent pairs of linear integrals
which are not in involution, the pencil $H - xG$ has $f$ pairs of linear elementary
divisors of the form $x - 0$, $x - 0$.

Obviously linear independence of linear forms is synonymous with func- 


tional independence. 


7. In the particular case of the small vibrations about an equilateral
Lagrangian libration point in the restricted problem of three bodies,$^{24}$ the
matrix $H$ of the Hamiltonian function is


(1 0 0 1 ) 
0 1 — 0 V 27 


] —z —5/4 


This matrix can be written more conveniently in the form ( ‘ - where 


24 Gyldén, Bull. Astr., vol. 1 (1884), pp. 361-369; E. Strömgren, “Über die kritische
Masse im problème restreint und über das problème restreint im allgemein,” Publikationer
og mindre Meddelelser fra Københavns Observatorium, Nr. 72 (1930).


The characteristic equation of $HG^{-1}$ is

(93) $x^4 + x^2 + \tfrac{27}{4}\mu(1 - \mu) = 0$,

and the roots of this equation are given by

(94) $x^2 = \tfrac{1}{2}\{-1 \pm \sqrt{1 - 27\mu(1 - \mu)}\}$.


If the radical in (94) is negative the four roots of (93) are all complex and
distinct while, if the radical is positive, the four roots, while still distinct, are
all purely imaginary. When the radical is zero the roots of (94) are all
imaginary but are equal in pairs. In this, the critical case, the general solution
contains secular terms.$^{25}$ Nevertheless, as far as quadratic integrals are concerned,
this case is the same as the other two. For, in the critical case, the
characteristic equation of $HG^{-1}$ is $(x^2 + \tfrac{1}{2})^2 = 0$. Since


—e—k —2 
(HG ) ( 


the minimal equation of $HG^{-1}$ is not $x^2 + \tfrac{1}{2} = 0$. Accordingly, for all values
of $\mu$, the minimal equation of $HG^{-1}$ is the same as its characteristic equation.
Therefore, by Theorem 1, there are exactly two independent quadratic integrals:
the energy integral whose matrix is $H$ and the integral whose matrix is
$(HG^{-1})^2H$. A simple calculation shows that


13 V 2% V 2% 5 
— (1— 2 (1—2 
V 27 " V27 
1 — — — 
V 27 2" 
= (1 — 2p) — 1) - ] 0 
27 
) 
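The three cases distinguished in this section can be illustrated numerically. The sketch below assumes the characteristic equation in the form $x^4 + x^2 + \tfrac{27}{4}\mu(1 - \mu) = 0$, which is the form consistent with the critical equation $(x^2 + \tfrac{1}{2})^2 = 0$ quoted in the text; the function names and the tolerance are hypothetical choices made for the example.

```python
import cmath
import math


def radicand(mu):
    """Quantity under the radical: 1 - 27 mu (1 - mu)."""
    return 1.0 - 27.0 * mu * (1.0 - mu)


def squared_roots(mu):
    """The two values of x^2 solving x^4 + x^2 + (27/4) mu (1 - mu) = 0."""
    root = cmath.sqrt(radicand(mu))
    return ((-1 + root) / 2, (-1 - root) / 2)


def classify(mu, tol=1e-9):
    """Nature of the four roots x for a given mass ratio mu."""
    d = radicand(mu)
    if abs(d) <= tol:
        return "critical"          # roots imaginary, equal in pairs
    if d > 0:
        return "purely imaginary"  # four distinct imaginary roots
    return "complex"               # four distinct complex roots


# Smaller root of 27 mu (1 - mu) = 1, the critical mass ratio.
CRITICAL_MU = (1 - math.sqrt(23 / 27)) / 2

if __name__ == "__main__":
    for mu in (0.01, CRITICAL_MU, 0.5):
        print(mu, classify(mu), squared_roots(mu))
```

In each of the three cases the minimal equation has degree four, so that with $n = 2$ and $m = 2$ the count $2n - m$ gives the two quadratic integrals named above.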


THE JOHNS HOPKINS UNIVERSITY. 


25 A. Wintner, “Librationstheorie des restringierten Dreikörperproblems,” Mathematische
Zeitschrift, vol. 32 (1930), pp. 660-661.




ERRATA. 




In the joint paper, J. E. Eaton and Oystein Ore: Remarks on multigroups,
in the January number of this JOURNAL, pp. 67-71, the following condition
has inadvertently been omitted in the definition of proper homomorphism:


3. If then for some M2, mz one has myme, M3. 


Mr. R. S. Pote of the University of Illinois has called our attention 
to this fact. 


In the paper, J. E. Eaton, Associative multiplicative systems, in the 
January number of this JOURNAL, pp. 222-232, the following statement 
appearing in § 5, p. 225 is incorrect: 

The X;’s then obviously form an m-system which is homomorphic to the 


original m-system. 


Dr. H. H. Campaigne of the University of Minnesota has called attention
to this error.


912 




AMERICAN 
JOURNAL OF MATHEMATICS 


FOUNDED BY THE JOHNS HOPKINS UNIVERSITY 


EDITED BY 


ABRAHAM COHEN F. D. MURNAGHAN 
THE JOHNS HOPKINS UNIVERSITY THE JOHNS HOPKINS UNIVERSITY 


T. H. HILDEBRANDT J. F. RITT 
UNIVERSITY OF MICHIGAN COLUMBIA UNIVERSITY 


R. L. WILDER 
UNIVERSITY OF MICHIGAN 


WITH THE COOPERATION OF 


OYSTEIN ORE E. T. BELL C. R. ADAMS

H. P. ROBERTSON H. B. CURRY R. D. JAMES

M. H. STONE E. J. MCSHANE SAUNDERS MACLANE
T. Y. THOMAS HANS RADEMACHER GABOR SZEGO

G. T. WHYBURN OSCAR ZARISKI LEO ZIPPIN 


PUBLISHED UNDER THE JOINT AUSPICES OF 


THE JOHNS HOPKINS UNIVERSITY 
AND 


THE AMERICAN MATHEMATICAL SOCIETY 


Volume LXII, Number 4 
OCTOBER, 1940 


THE JOHNS HOPKINS PRESS 
BALTIMORE, MARYLAND 
U. S. A. 




CONTENTS 


On the non-existence of the Euclidean algorithm in certain quadratic number fields. By ALFRED BRAUER,
Postulational bases for the umbral calculus. By E. T. BELL,
The abelian quasi-group. By HARRIET GRIFFIN, 725
The Gaussian law of errors in the theory of additive number theoretic functions. By P. ERDÖS and M. KAC, 738
On the standard deviations of additive arithmetical functions. By PHILIP HARTMAN and AUREL WINTNER, 743
On the almost periodicity of additive number-theoretical functions. By PHILIP HARTMAN and AUREL WINTNER, 753
On the spherical approach to the normal distribution law. By PHILIP HARTMAN and AUREL WINTNER, 759
On upper limit relations for number theoretical functions. By PHILIP HARTMAN and RICHARD KERSHNER, 780
On the properties of a collective. By Z. W. BIRNBAUM and HERBERT S. ZUCKERMAN,
On symmetric Bernoulli convolutions. By TATSUO KAWATA,
The four-vertex theorem for spherical curves. By S. B. JACKSON,
A complete characterization of sectional families of curves. By ANNETTE VASSELL, 813
Exactly (k, 1) transformations on connected linear graphs. By O. G. HARROLD, JR., 823
The characterization of pseudo-spherical sets. By LEONARD BLUMENTHAL and GEORGE R. THURMAN, 835
A geometry associated with Cremona’s equations. By GERALD B. HUFF, 855
Polynomials whose real part is bounded on a given curve in the complex plane. By A. C. SCHAEFFER and G. SZEGÖ, 868
Neuer Beweis eines Satzes von G. H. Hardy und S. Ramanujan über das asymptotische Verhalten der Zerfällungskoeffizienten. Von VOJISLAV G. AVAKUMOVIĆ, 877
An algebraic problem involving the involutory integrals of linear dynamical systems. By JOHN WILLIAMSON, 881


THE AMERICAN JOURNAL OF MATHEMATICS will appear four times yearly. 

The subscription price of the JOURNAL for the current volume is $7.50 (foreign
postage 50 cents); single numbers $2.00.

A few complete sets of the JOURNAL remain on sale.

Papers intended for publication in the JOURNAL may be sent to any of the Editors.

Editorial communications may be sent to Professor F. D. MURNAGHAN at The Johns
Hopkins University.

Subscriptions to the JOURNAL and all business communications should be sent to
THE JOHNS HOPKINS PRESS, BALTIMORE, MARYLAND, U. S. A.


Entered as second-class matter at the Baltimore, Maryland, Postoffice, acceptance for mailing at special 
rate of postage provided for in Section 1103, Act of October 3, 1917, Authorized on July 8, 1918. 


PRINTED IN THE UNITED STATES OF AMERICA 
BY J. H. FURST COMPANY, BALTIMORE, MARYLAND 




THE JOHNS HOPKINS PRESS * BALTIMORE 


American Journal of Mathematics. Edited by ABRAHAM COHEN, T. H. HILDEBRANDT,
F. D. MURNAGHAN, J. F. RITT and R. L. WILDER. Quarterly. 8vo. Volume LXII
in progress. $7.50 per volume. (Foreign postage, fifty cents.)

American Journal of Philology. Edited by H. Cuerniss, K. Matone, B. D. MERITT, and 
D. M. Rosinson. Quarterly. 8vo. Volume LXI in progress. $5 per volume. 
(Foreign postage twenty-five cents.) 

Biologia Generalis. (International Journal of Biology). Founded by Leorotp LO6n- 
NER, Graz; RAYMOND PEARL, Baltimore, and VLADISLAV RiZiéKa, Prague. It is 
now edited by O. ABEL, L. ADAMETZ, O. Porscu, C. Schwarz, J. VERSLUYS and 
R. WAsiIcky of Vienna, 8vo. Volume XV in progress. 

Bulletin of the History of Medicine. Edited by Henry E. Sicertst. Monthly except 
August and September. Volume VIII in progress. 8vo. Subscription $5 per 
year. (Foreign postage, fifty cents.) 

Bulletin of the Johns Hopkins Hospital. Edited by James Borptey, III. Monthly. 
Volume LXVII in progress. 8vo. Subscription $6 per year. (Foreign postage, 
fifty cents.) 

Comparative Psychology Monographs. Roy M. Dorcus, Managing Editor. 8vo. Vol- 
ume XVI in progress. $5 per volume. 

Hesperia. Edited by WILLIAM KURRELMEYER and KeMp MALone. 8vo. Thirty-two 
numbers have appeared. 

Human Biology: a record of research, RAyMOND Peart, Editor. Quarterly. 8vo. 
Volume XII in progress. $5 per volume. (Foreign postage, thirty-five cents.) 

Johns Hopkins Studies in Romance Literatures and Languages. H.C. LANcastenr, Edi- 
tor. 8vo. Forty-nine numbers have been published. 

Johns Hopkins University Circular, including the President’s Report and Catalogue of 
the School of Medicine. Ten times yearly. 8vo. $1 per year. 

Johns Hopkins University Studies in Archaeology. Davin M. Roginson, Editor. 8vo. 
Twenty-nine volumes have appeared. 

Johns Hopkins University Studies in Education. FLORENCE E. BAMBERGER, Editor.
8vo. Twenty-eight numbers have appeared. 

Johns Hopkins University Studies in Geology. EDWARD B. MATHEWS, Editor. 8vo.
Thirteen numbers have been published. 

Johns Hopkins University Studies in Historical and Political Science. Under the direc- 
tion of the Departments of History, Political Economy and Political Science. 
8vo. Volume LVIII in progress. $5 per volume. 

Modern Language Notes. Edited by H. C. LANCASTER, W. KURRELMEYER, R. D. HAVENS,
K. MALONE, H. SPENCER and C. S. SINGLETON. Eight times yearly. 8vo. Volume
LV in progress. $5 per volume. (Foreign postage, fifty cents.) 

Reprint of Economic Tracts. J. H. HOLLANDER, Editor. Fifth series in progress.
Price $4. 

Terrestrial Magnetism and Atmospheric Electricity. Founded by LOUIS A. BAUER;
conducted by J. A. FLEMING with the cooperation of eminent investigators.
Quarterly. 8vo. Volume XLV in progress. $3.50 per volume. 

Walter Hines Page School of International Relations. Eight volumes have been published. 


A complete list of publications will be sent upon request 




THE THEORY OF GROUP REPRESENTATIONS 
By Francis D. Murnaghan 


We have attempted to give a quite elementary and self-contained account 
of the theory of group representations with special reference to those groups 
(particularly the symmetric group and the rotation group) which have turned 
out to be of fundamental significance for quantum mechanics (especially 
nuclear physics). We have devoted particular attention to the theory of group 
integration (as developed by Schur and Weyl); to the theory of two-valued
or spin representations; to the representations of the symmetric group and
the analysis of their direct products; to the crystallographic groups; and to 
the Lorentz group and the concept of semi-vectors (as developed by Einstein 
and Mayer). —Extract from Preface. 


380 pages, 8vo, cloth, $5.00 


NUMERICAL MATHEMATICAL ANALYSIS 
By JAMES B. SCARBOROUGH 


“A valuable feature of the book is the excellent collection of examples at the end of 
each chapter. ... The book has many admirable features. The explanations and deriva- 
tions of formulae are given in detail. ... The author has avoided introducing new and 
complicated notations which, although they may conduce to brevity, are a serious 
stumbling block to the reader. The typography and paper are excellent.” 

—American Mathematical Monthly.


430 pages, 25 figures, crown 8vo, buckram, $5.50 


TABLES OF $\sqrt{1-r^2}$ AND $1-r^2$ FOR USE IN PARTIAL
CORRELATION AND IN TRIGONOMETRY


By JOHN RICE MINER, Sc. D. 


These tables fill a want long felt by practical workers in all branches of statistics. 
Everyone who uses the method of correlation has wished for tables from which the 
probable error of a coefficient of correlation could be obtained with accuracy. Similar 
tables to this have existed on a small scale, but never before have there been available 
tables of $1-r^2$ and $\sqrt{1-r^2}$ to 6 places of decimals, and 4 places in the argument.
Not only are these tables of great usefulness in getting the probable error of a correlation
coefficient, but also they have what will perhaps be their chief value in the calculations 
involved in the method of partial or net correlation. It is safe to say that these tables 
reduce the labor involved in this widely used statistical method by at least one-half. 


50 pages, 8vo, cloth, $1.50 


THE JOHNS HOPKINS PRESS · BALTIMORE




